ABSTRACT

The first half of this paper explores the origin of systematic biases in the
measurement of weak gravitational lensing. Compared to previous work, we expand
the investigation of PSF instability and fold in for the first time the effects
of non-idealities in electronic imaging detectors and imperfect galaxy shape
measurement algorithms. Together, these now explain the additive
$\mathcal{A}(\ell)$ and multiplicative $\mathcal{M}(\ell)$ systematics typically
reported in current lensing measurements. We find that overall performance is
driven by a product of a telescope/camera's absolute performance, and our
knowledge about its performance.

The second half of this paper propagates any residual shear measurement biases
through to their effect on cosmological parameter constraints. Fully exploiting
the statistical power of Stage IV weak lensing surveys will require additive
biases $\mathcal{A} \lesssim 1.8\times10^{-12}$ and multiplicative biases
$\mathcal{M} \lesssim 4.0\times10^{-3}$. These can be allocated between
individual budgets in hardware, calibration data and software, using results
from the first half of the paper.

If instrumentation is stable and well-calibrated, we find extant shear
measurement software from GREAT10 already meets requirements on galaxies
detected at S/N=40. Averaging over a population of galaxies with a realistic
distribution of sizes, it also meets requirements for a 2D cosmic shear
analysis from space. If used on fainter galaxies or for 3D cosmic shear
tomography, existing algorithms would need calibration on simulations to avoid
introducing bias at a level similar to the statistical error.

Requirements on hardware and calibration data are discussed in more detail in a
companion paper. Our analysis is intentionally general, but is specifically
being used to drive the hardware and ground segment performance budget for the
design of the European Space Agency's recently-selected Euclid mission.

Key words: gravitational lensing — cosmology: cosmological parameter — instru- 
mentation: detectors — methods: data analysis 
1 INTRODUCTION

Statistical measurements of weak gravitational lensing in a large sample of
galaxies offer a direct way to probe the dark sector of the Universe (see
reviews by Hoekstra & Jain 2008; Massey, Kitching & Richard 2010).
Gravitational lensing is the deflection of light from distant galaxies during
its journey to us, by an amount that depends on the intervening distribution of
matter (including dark matter) and the geometry of spacetime (which is
currently governed by dark energy). The deflection of light produces slight
shear distortions in the galaxies' apparent shapes, and adjacent galaxies
appear to line up in characteristic patterns across the sky.

Galaxy ellipticities are typically distorted only a few 
percent by weak gravitational lensing. Detecting this tiny 
signal is difficult because the image shapes are also changed 
an order of magnitude more by convolution with the Point 
Spread Function (PSF) of the telescope, detector and atmo- 
sphere, as well as by distortion in the camera. These other 
effects must be modelled and corrected; even subtle residual 
contributions can significantly bias cosmological measure- 
ments. 

In the first half of this paper we explore three types of 
error that affect galaxy shape measurement: 

• inaccuracies in the model of the convolutional PSF,
from which observed galaxy shapes must be deconvolved
(this builds upon work by Paulin-Henriksson et al. 2008).

• inaccuracies in correction for any effect that cannot be 
treated as a deconvolution. This includes detector effects 
such as Charge Transfer Inefficiency in CCDs or Inter-Pixel 
Capacitance in HgCdTe devices, which perturb pixel values 
in a nonlinear fashion. 

• inaccuracies in the measurement of galaxy shapes. Min- 
imising noise, particularly in faint galaxies, forces measure- 
ment methods to apply pixel weights which must subse- 
quently be undone. 

We propagate these measurement errors through a tomographic cosmic shear
analysis (theory developed by Hu 1999; Jain & Taylor 2003; Bernstein & Jain
2004, and measurements obtained by Kitching et al. 2007; Massey et al. 2007a;
Schrabback et al. 2010) to determine the bias they induce upon constraints on
the dark energy equation of state parameter $w$ (Song & Knox 2004; Simpson &
Bridle 2005; Ishak 2005).



In the second half of this paper, we establish requirements on additive and
multiplicative cosmic shear systematics to meet future scientific goals. We
also use our earlier results to consider how residual additive biases can be
empirically identified and removed, and assess the impact of residual
multiplicative biases that cannot be self-consistently identified within a data
set. Our analysis is intentionally performed with a scope sufficiently general
to cover any future Stage IV weak gravitational lensing survey. It is
particularly motivated by, and drives the hardware and ground segment
performance budget for the design of the European Space Agency's
recently-selected Euclid mission (Laureijs et al. 2011). This work generalises
the conclusions of Amara, Refregier & Paulin-Henriksson (2010). During the
final preparation of this paper, Chang et al. (2012) posted to the arXiv an
analysis of future prospects for the Large Synoptic Survey Telescope (LSST).
There is some overlap in ambition, but complementary methodology. Like Chang
et al. we employ a bottom-up approach in this paper, propagating various
instrumental imperfections through to errors on cosmological parameters.
However, rather than simulating the detailed performance of a baseline
telescope model, we work analytically to build a general framework for
propagating general system performance. In a companion paper, Cropper et al.
(2012), this allows us to perform a top-down, systems engineering analysis:
starting from the science requirements and flowing down to requirements on
subsystem performances. Using the understanding from this paper, the total
error budget and mitigation can be sensibly allocated between individual
budgets in hardware, calibration data and software performance.

This paper is organised as follows. In Section 2 we define the basic galaxy
shape and cosmological quantities of interest that would be measured in a weak
lensing experiment with no (or idealised) errors. In Section 3 we explore the
various types of error that can be introduced during realistic galaxy shape
measurement and may prevent recovery of the true signal. Our underlying
approach builds upon the work of Paulin-Henriksson et al. (2008); however the
mathematical expressions rapidly lengthen when we introduce more sources of
error. For clarity, we therefore choose to evolve the formalism in three
stages, one for each source of error. In Section 4 we derive requirements on
shear measurement biases for a cosmic shear survey seeking to measure dark
energy. In Section 5 we determine whether those requirements are met by extant
shear measurement software described in the literature. We do this at fixed
galaxy fluxes and, using our results from the first half of the paper,
averaging over the full population of galaxies that will be seen by a survey.
We conclude in Section 6.



2 IDEALISED WEAK LENSING 
MEASUREMENT 

2.1 Perfect shear measurement 

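Many techniques have been developed to precisely measure the shapes of galaxies
(see Bridle et al. 2010; Kitching et al. 2011). For the sake of a concrete
example, we shall consider the generic class of methods based upon galaxies'
quadrupole moments. In a method based on unweighted quadrupole moments (see
Bartelmann & Schneider 2001), the shape of any localised object in a 2D image
$I(r,\theta)$ can be quantified via its size

$$ R^2 \equiv \frac{\iint I(r,\theta)\, r^2\, r\,{\rm d}r\,{\rm d}\theta}{\iint I(r,\theta)\, r\,{\rm d}r\,{\rm d}\theta} \tag{1} $$

and complex ellipticity

$$ \varepsilon \equiv \varepsilon_1 + i\varepsilon_2 \equiv \frac{\iint I(r,\theta)\, r^2 e^{2i\theta}\, r\,{\rm d}r\,{\rm d}\theta}{\iint I(r,\theta)\, r^2\, r\,{\rm d}r\,{\rm d}\theta}. \tag{2} $$

Gravitational lensing magnifies and shears a galaxy of intrinsic size
$R_{\rm int}$ and ellipticity $\varepsilon_{\rm int}$ into one of size
$R_{\rm gal}$ and ellipticity

$$ \varepsilon_{\rm gal} = \varepsilon_{\rm int} + P_\gamma\,\gamma, \tag{3} $$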



where the shear 'polarizability'
$P_\gamma = \partial\varepsilon_{\rm int}/\partial\gamma \approx 1.86$ is the
amount by which the ellipticity of a galaxy changes during gravitational
lensing^1 (Kaiser, Squires & Broadhurst 1995; Luppino & Kaiser 1997). When this
galaxy is imaged by any camera, it is convolved with a Point Spread Function
(PSF, of size $R_{\rm PSF}$ and ellipticity $\varepsilon_{\rm PSF}$), producing
an observed source of larger size

$$ R_{\rm obs}^2 = R_{\rm gal}^2 + R_{\rm PSF}^2 \tag{4} $$

and perturbed ellipticity^2

$$ \varepsilon_{\rm obs} = \varepsilon_{\rm gal} + \frac{R_{\rm PSF}^2}{R_{\rm gal}^2 + R_{\rm PSF}^2}\left(\varepsilon_{\rm PSF} - \varepsilon_{\rm gal}\right). \tag{5} $$
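As a numerical illustration of equations (1), (2), (4) and (5), the sketch below measures unweighted quadrupole moments on a pixel grid and checks that moments add under convolution. It is a minimal example rather than part of the original analysis: the function names `unweighted_moments` and `gaussian_image`, the grid and the covariance matrices are all assumptions chosen for illustration, and the convolution of two Gaussians is formed analytically (by summing their covariance matrices) instead of by pixel convolution.

```python
import numpy as np

def unweighted_moments(image, x, y):
    """Return (R^2, complex ellipticity) from unweighted quadrupole moments."""
    flux = image.sum()
    xc = (image * x).sum() / flux          # centroid
    yc = (image * y).sum() / flux
    dx, dy = x - xc, y - yc
    qxx = (image * dx * dx).sum() / flux   # second moments about the centroid
    qyy = (image * dy * dy).sum() / flux
    qxy = (image * dx * dy).sum() / flux
    r2 = qxx + qyy                         # equation (1): r^2 = x^2 + y^2
    eps = (qxx - qyy + 2j * qxy) / r2      # equation (2): r^2 e^{2i theta} = (x + iy)^2
    return r2, eps

def gaussian_image(cov, x, y):
    """Elliptical Gaussian surface brightness with covariance matrix `cov`."""
    inv = np.linalg.inv(cov)
    arg = inv[0, 0] * x**2 + 2 * inv[0, 1] * x * y + inv[1, 1] * y**2
    return np.exp(-0.5 * arg)

# Illustrative grid and shapes (arbitrary units; hypothetical values).
coords = np.arange(-40.0, 40.0 + 0.125, 0.25)
x, y = np.meshgrid(coords, coords)
cov_gal = np.array([[4.0, 1.0], [1.0, 2.0]])
cov_psf = np.array([[1.5, 0.2], [0.2, 1.0]])

r2_gal, eps_gal = unweighted_moments(gaussian_image(cov_gal, x, y), x, y)
r2_psf, eps_psf = unweighted_moments(gaussian_image(cov_psf, x, y), x, y)
# Convolving two Gaussians gives a Gaussian whose covariance is the sum.
r2_obs, eps_obs = unweighted_moments(gaussian_image(cov_gal + cov_psf, x, y), x, y)

# Equation (5), evaluated from the separately measured galaxy and PSF moments.
eps_obs_predicted = eps_gal + r2_psf / (r2_gal + r2_psf) * (eps_psf - eps_gal)
```

For these noise-free Gaussians, `r2_obs` should match `r2_gal + r2_psf` and `eps_obs` should match `eps_obs_predicted` to numerical precision.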



Weak lensing analyses observe the shape of each galaxy 
then try to correct it for (or deconvolve it from) the PSF, to 
recover the galaxy's true ellipticity. The system PSF can be 
measured from stars also within the field of view. Rearranging equations (4)
and (5) so that only observable quantities appear on the right hand side, the
galaxy's ellipticity is

$$ \varepsilon_{\rm gal} = \frac{\varepsilon_{\rm obs} R_{\rm obs}^2 - \varepsilon_{\rm PSF} R_{\rm PSF}^2}{R_{\rm obs}^2 - R_{\rm PSF}^2}. \tag{6} $$

The intrinsic ellipticity of individual galaxies is uninteresting so, to
isolate the cosmologically relevant information, ellipticity is normalised into
a shear estimator

$$ \tilde\gamma \equiv (P_\gamma)^{-1}\,\varepsilon_{\rm gal}. \tag{7} $$

This ensures that, averaging over a large number of galaxies,

$$ \langle\tilde\gamma\rangle = \frac{\langle\varepsilon_{\rm int}\rangle}{P_\gamma} + \langle\gamma\rangle \tag{8} $$

and we recover $\langle\tilde\gamma\rangle = \langle\gamma\rangle$ so long as
the intrinsic galaxy ellipticities are random and hence
$\langle\varepsilon_{\rm int}\rangle = 0$ (but see Crittenden et al. 2001;
Catelan et al. 2001; Natarajan et al. 2001; Joachimi & Schneider 2008;
Schneider & Bridle 2010; Kirk et al. 2012 for instances of 'intrinsic
alignments' when this does not hold).
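The chain from equation (3) to equation (8) can be sketched as a small Monte Carlo. This is an illustration under stated assumptions, not the authors' pipeline: the helper names `observe` and `correct_psf`, the PSF values, the range of galaxy sizes and the sample size are hypothetical, while the intrinsic ellipticity dispersion of 0.26 per component and $P_\gamma = 1.86$ follow the values quoted in the text.

```python
import numpy as np

P_GAMMA = 1.86  # scalar shear polarizability assumed in the text

def observe(eps_gal, r2_gal, eps_psf, r2_psf):
    """Equations (4) and (5): convolve galaxy moments with the PSF."""
    r2_obs = r2_gal + r2_psf
    eps_obs = eps_gal + r2_psf / r2_obs * (eps_psf - eps_gal)
    return eps_obs, r2_obs

def correct_psf(eps_obs, r2_obs, eps_psf, r2_psf):
    """Equation (6): recover the galaxy ellipticity from observables."""
    return (eps_obs * r2_obs - eps_psf * r2_psf) / (r2_obs - r2_psf)

rng = np.random.default_rng(1)
n = 200_000
true_shear = 0.02 + 0.01j                          # hypothetical input shear
eps_int = 0.26 * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
eps_gal = eps_int + P_GAMMA * true_shear           # equation (3)
r2_gal = rng.uniform(0.5, 2.0, n)                  # hypothetical galaxy sizes
eps_psf, r2_psf = 0.05 - 0.03j, 1.0                # hypothetical, perfectly known PSF

eps_obs, r2_obs = observe(eps_gal, r2_gal, eps_psf, r2_psf)
eps_rec = correct_psf(eps_obs, r2_obs, eps_psf, r2_psf)
shear_estimates = eps_rec / P_GAMMA                # equation (7)
mean_shear = shear_estimates.mean()                # equation (8)
```

With a perfectly known PSF the correction is exact for every galaxy, so the only scatter in `mean_shear` about `true_shear` comes from the random intrinsic ellipticities.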

The average shear is zero so, to compare to theoretical models, the measured
shears are then combined into two-point correlation functions

\begin{align}
\xi_+(\theta, z_A, z_B) &= \langle\tilde\gamma_A\,\tilde\gamma_B^*\rangle(\theta, z_A, z_B), \tag{9}\\
\xi_-(\theta, z_A, z_B) &= {\rm Re}\left(\langle\tilde\gamma_A\,\tilde\gamma_B\rangle(\theta, z_A, z_B)\right), \tag{10}
\end{align}

where the angle brackets indicate averaging over all pairs of galaxies $A$ and
$B$ in a survey that are at redshifts $z_A$ and $z_B$ and separated on the sky
by an angle $\theta$, or within bins around those values (Crittenden et al.
2000; Bartelmann & Schneider 2001). The correlation functions trace a
cosmological, 'cosmic shear' signal at $\theta > 0$. Results are often
expressed in terms of the shear power spectrum $C(\ell)$, the Fourier transform
of a weighted sum of $\xi_\pm(\theta)$.
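The pair average in equation (9) can be sketched with a brute-force estimator. This is a toy illustration only: the function name `xi_plus`, the flat-sky unit-square geometry, the single redshift bin and the bin edges are assumptions, and a real survey analysis would use a tree-based pair counter rather than this O(N^2) loop. Self-pairs are excluded, so no zero-lag autocorrelation enters.

```python
import numpy as np

def xi_plus(x, y, shear, bin_edges):
    """Average gamma_A * conj(gamma_B) over distinct pairs in separation bins."""
    i, j = np.triu_indices(len(x), k=1)          # each pair once, no self-pairs
    sep = np.hypot(x[i] - x[j], y[i] - y[j])
    prod = shear[i] * np.conj(shear[j])
    which = np.digitize(sep, bin_edges) - 1      # bin index of every pair
    xi = np.full(len(bin_edges) - 1, np.nan, dtype=complex)
    for b in range(len(bin_edges) - 1):
        in_bin = which == b
        if in_bin.any():
            xi[b] = prod[in_bin].mean()
    return xi

rng = np.random.default_rng(0)
x, y = rng.uniform(0, 1, 300), rng.uniform(0, 1, 300)
shear = np.full(300, 0.02 + 0.01j)    # toy case: perfectly coherent shear
xi = xi_plus(x, y, shear, np.array([0.0, 0.25, 0.5, 1.0, 1.5]))
```

For a perfectly coherent shear field every pair contributes $|\gamma|^2$, so each bin of `xi` equals $5\times10^{-4}$ in this toy case.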



^1 Strictly, $P_\gamma$ depends on galaxy morphology and is also a 2 × 2
tensor acting separately on the real and imaginary components of ellipticity.
On average, however, it is very close to the identity tensor times a real
scalar $2 - \langle|\varepsilon_{\rm int}|^2\rangle$ (Rhodes, Refregier & Groth
2000). Leauthaud et al. (2007) show that
$\langle|\varepsilon_{\rm int}|^2\rangle$ is consistent with a constant value
of $2\times0.26^2$ for galaxies to at least redshift $z = 2.6$. We greatly
simplify subsequent analysis by assuming scalar $P_\gamma \approx 1.86$.

^2 Relationships (4) and (5) are exact using unweighted moments, but hold for
some other methods only if both the galaxy and the PSF are approximately
Gaussian. We shall return to this issue in Section 3.3.



If galaxy shapes are autocorrelated with themselves, a zero-lag term
proportional to $\delta(\theta=0)$ is added. This is included by
Paulin-Henriksson et al. (2008, equation 11) but we disregard it because it can
be readily avoided by excluding such galaxy pairs in practice. If the
autocorrelation term were not removed from an analysis, it would be white
noise, independent of scale in the Fourier transform. It would then have to be
marginalised over as an unknown constant of integration, subtracted from
measurements, or added to theoretical models.



2.2 Parametric shear measurement bias 

Deviations from perfect shear measurement are commonly parameterised following
the Shear TEsting Programme (STEP; Heymans et al. 2006; Massey et al. 2007b) as

$$ \hat\gamma = (1 + m)\,\gamma + c. \tag{11} $$

We shall henceforth represent all real-world, imperfect measurements using a
hat.

2.2.1 Constant shear measurement bias 

We first consider shear measurements that have small additive bias $c$ with
constant mean $\langle c\rangle$ and random noise $\sigma_c$, plus small
multiplicative bias $m$ with constant mean $\langle m\rangle$ and random noise
$\sigma_m$. Pairs of these shear measurements can be folded through the
calculation of a correlation function (9) to produce

\begin{align}
\hat\xi_+(\theta, z_A, z_B) &= \langle\hat\gamma_A\,\hat\gamma_B^*\rangle \tag{12}\\
&= \langle(1+m)(1+m)\rangle\,\xi_+ + \langle c\,c^*\rangle \tag{13}
\end{align}

plus cross terms only in the presence of shear-dependent selection effects (see
e.g. Jain, Jarvis & Bernstein 2006, and the discussion in Appendix A).

Taking the Fourier transform yields a power spectrum^3

$$ \hat C(\ell, z_A, z_B) = (1 + \mathcal{M})\,C(\ell, z_A, z_B) + \mathcal{A}, \tag{14, 15} $$

where

\begin{align}
\mathcal{M} &= 2\langle m\rangle + \langle m^2\rangle = 2\langle m\rangle + \langle m\rangle^2 + \sigma_m^2\,\delta(0) \tag{16}\\
&\approx 2\langle m\rangle. \tag{17}
\end{align}

We have expanded the mean squared error $\langle m^2\rangle$ term but note that
$\langle m\rangle^2 \ll \langle m\rangle$ and that $\mathcal{M}$ terms arise
only from the correlation of galaxies with other galaxies. Thus a constant
multiplicative bias in shear measurements leads to a similarly constant
multiplicative bias in measurements of the shear power spectrum. If the
autocorrelation terms discussed in Section 2.1 are included, equation (15)
gains an additional white noise term

$$ \left[1 + 2\langle m\rangle + \langle m^2\rangle\right] P_\gamma^{-2}\,\sigma_\varepsilon^2 + \sigma_c^2. \tag{18} $$

This notation is also discussed in Kitching et al. (2012a).

^3 Amara & Refregier (2008, eqn. 13) rearrange (12) as a Taylor series
expansion of the measured correlation function

$$ \hat C(\ell) = C(\ell) + \left\{ A_0 + A_1\,C(\ell) + \ldots \right\} $$

and label everything inside the curly brackets as different types of 'additive
error' $C^{\rm sys}$. Simple multiplicative biases easily arise, so we instead
find it helpful to keep terms $\mathcal{A}$ and $\mathcal{M}$ separate. Because
of the shape of the ΛCDM cosmological power spectrum, they are nearly
orthogonal and have quite different implications (see Section 4). We therefore
restrict our notation for $\mathcal{A}$ to refer solely to pure additive terms.
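As a worked example of equations (16) and (17), the short sketch below compares the full expression for the multiplicative systematic with its leading-order approximation. The function name `multiplicative_systematic` and the numerical value of the mean bias are hypothetical, and the delta function $\delta(0)$ is represented simply by a flag that switches the $\sigma_m^2$ term on at zero lag.

```python
def multiplicative_systematic(m_mean, m_sigma=0.0, zero_lag=False):
    """Equation (16); the sigma_m^2 term only enters for zero-lag (self) pairs."""
    big_m = 2.0 * m_mean + m_mean**2
    if zero_lag:
        big_m += m_sigma**2
    return big_m

m_mean = 1e-3                                   # hypothetical mean multiplicative bias
exact = multiplicative_systematic(m_mean)       # 2<m> + <m>^2
approx = 2.0 * m_mean                           # equation (17)
relative_error = (exact - approx) / exact       # size of the neglected <m>^2 term
```

For a per-mille bias the neglected term changes the systematic by about 0.05 per cent, which is why the linear form in equation (17) is adequate when the bias is small.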



2.2.2 Spatially/temporally varying shear measurement
bias

We next consider shear measurement biases in which $c$ and $m$ vary from galaxy
to galaxy, and their deviations from a mean value can be correlated in patterns
across a survey. In general, $\mathcal{A}$ and $\mathcal{M}$ become functions
of scale, orientation on the sky, and redshift

$$ \hat C(\boldsymbol{\ell}, z_A, z_B) = \sum_{\boldsymbol{\ell}'} \left(1 + \mathcal{M}(\boldsymbol{\ell}, \boldsymbol{\ell}', z_A, z_B)\right) C(\boldsymbol{\ell}', z_A, z_B) + \mathcal{A}(\boldsymbol{\ell}, z_A, z_B) \tag{19} $$

(see Appendix A of Kitching et al. 2012a). The additive systematics
$\mathcal{A}$ now include a contribution from the spatially varying additive
shear measurement bias. The $\mathcal{M}$ matrices mix power from different
scales, as well as physical E-mode and non-physical B-mode signals, where
$C = C_E + iC_B$. Anisotropic errors could arise from PSF terms in off-axis
cameras, from non-square pixels, from some detector effects, or in ground-based
surveys where gravity loading and the prevailing wind can impose preferred
directions. In this paper, we shall only consider the simpler situation in
which the systematic errors are isotropic on average within a survey. In this
case, the matrices are diagonal, so $\mathcal{A}$ and $\mathcal{M}$ become
functions of only scale and redshift

$$ \hat C(\ell, z_A, z_B) = \left(1 + \mathcal{M}(\ell, z_A, z_B)\right) C(\ell, z_A, z_B) + \mathcal{A}(\ell, z_A, z_B). \tag{20} $$

Using the notation $\sigma^2[x]$ to represent the covariance about the mean of
error $\delta x$ in pairs of galaxies separated by $\theta > 0$, we find

$$ \mathcal{A}(\ell, z_A, z_B) = \sigma^2[|c|](\ell, z_A, z_B) \tag{21} $$

$$ \mathcal{M}(\ell, z_A, z_B) \approx \sigma^2[m](\ell, z_A, z_B) + 2\langle m\rangle(z_A, z_B). \tag{22} $$

Paulin-Henriksson et al. (2008) miss the second half of equation (22) because
they ignore bias terms when expanding mean squared errors. This was reasonable
for purely additive systematics, as spatially constant terms disappear during a
Fourier transform; but in this case we judge that the bias term is likely to be
the most problematic. Instead, we find that cosmic shear biases arise from a
combination of (a) absolute biases in shear measurement and (b) uncertainty in
or lack of knowledge about shear measurements. This dichotomy will emerge as a
general result throughout Section 3.



3 REALISTIC WEAK LENSING 
MEASUREMENT ERRORS 

3.1 Imperfect PSF correction 

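Errors in shear measurement can arise from several sources. For example, our
model of the PSF will inevitably be imperfect because it is obtained from noisy
stars and must be interpolated to the position and colour of each galaxy (e.g.
Hoekstra 2004; Mandelbaum et al. 2005; Jain, Jarvis & Bernstein 2006;
Paulin-Henriksson, Refregier & Amara 2009; Cypriano et al. 2010).

Via a first order Taylor series expansion of equation (6), model errors in the
PSF size $\delta(R_{\rm PSF}^2)$ and ellipticity $\delta\varepsilon_{\rm PSF}$
propagate into an imperfect estimate of the galaxy ellipticity

$$ \hat\varepsilon_{\rm gal} \approx \varepsilon_{\rm gal} + \frac{\partial\varepsilon_{\rm gal}}{\partial(R_{\rm PSF}^2)}\,\delta(R_{\rm PSF}^2) + \frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm PSF}}\,\delta\varepsilon_{\rm PSF}. \tag{23} $$

The partial derivatives of (6) are

$$ \frac{\partial\varepsilon_{\rm gal}}{\partial(R_{\rm PSF}^2)} = \frac{R_{\rm obs}^2\left(\varepsilon_{\rm obs} - \varepsilon_{\rm PSF}\right)}{\left(R_{\rm obs}^2 - R_{\rm PSF}^2\right)^2} = \frac{\varepsilon_{\rm gal} - \varepsilon_{\rm PSF}}{R_{\rm gal}^2} \tag{24} $$

$$ \frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm PSF}} = \frac{-R_{\rm PSF}^2}{R_{\rm obs}^2 - R_{\rm PSF}^2} = -\frac{R_{\rm PSF}^2}{R_{\rm gal}^2} \tag{25} $$

and the derivative with respect to the other real/imaginary component of the
PSF ellipticity is zero. In equations (24) and (25), the first equality is
expressed in terms that are observable in an image, and the second equality
reflects fundamental source properties. Inserting the latter into (23) yields

$$ \hat\varepsilon_{\rm gal} \approx \varepsilon_{\rm gal}\left(1 + \frac{\delta(R_{\rm PSF}^2)}{R_{\rm gal}^2}\right) - \left(\frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\,\delta\varepsilon_{\rm PSF} + \frac{\delta(R_{\rm PSF}^2)}{R_{\rm gal}^2}\,\varepsilon_{\rm PSF}\right). \tag{26} $$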



Arranged thus (c.f. Paulin-Henriksson et al. 2008, eqn. 8), the last two terms
display an elegant symmetry: the product of the PSF size and our knowledge of
its ellipticity, then its absolute ellipticity and our knowledge of its size.
The STEP parameters can be easily read off from this expression. Note that if
the PSF ellipticity is known perfectly ($\delta\varepsilon_{\rm PSF} = 0$),
$c = -m\,\varepsilon_{\rm PSF}/P_\gamma$ and the two are related.
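Reading the STEP parameters off the first-order expansion can be sketched as follows. The identification used here, $m = \delta(R_{\rm PSF}^2)/R_{\rm gal}^2$ and $c = -[(R_{\rm PSF}^2/R_{\rm gal}^2)\,\delta\varepsilon_{\rm PSF} + (\delta(R_{\rm PSF}^2)/R_{\rm gal}^2)\,\varepsilon_{\rm PSF}]/P_\gamma$, follows from dividing equation (26) by $P_\gamma$; the function names and all numerical values are hypothetical. The sketch also compares the linear prediction with an exact application of equation (6) using an imperfect PSF model.

```python
P_GAMMA = 1.86  # scalar shear polarizability assumed in the text

def step_parameters_from_psf_errors(r2_gal, r2_psf, eps_psf, d_r2_psf, d_eps_psf):
    """Read m and c off equation (26), after dividing by P_gamma."""
    m = d_r2_psf / r2_gal
    c = -(r2_psf / r2_gal * d_eps_psf + d_r2_psf / r2_gal * eps_psf) / P_GAMMA
    return m, c

def exact_estimate(eps_gal, r2_gal, eps_psf, r2_psf, d_r2_psf, d_eps_psf):
    """Apply equation (6) with an imperfect PSF model to the true observables."""
    r2_obs = r2_gal + r2_psf                                    # equation (4)
    eps_obs = eps_gal + r2_psf / r2_obs * (eps_psf - eps_gal)   # equation (5)
    r2_model, eps_model = r2_psf + d_r2_psf, eps_psf + d_eps_psf
    return (eps_obs * r2_obs - eps_model * r2_model) / (r2_obs - r2_model)

# Hypothetical values in arbitrary units.
eps_gal, r2_gal = 0.2 + 0.1j, 1.0
eps_psf, r2_psf = 0.05 - 0.02j, 0.5
d_r2_psf, d_eps_psf = 5e-4, 1e-3 + 2e-4j

m, c = step_parameters_from_psf_errors(r2_gal, r2_psf, eps_psf, d_r2_psf, d_eps_psf)
linear = (1 + m) * eps_gal + P_GAMMA * c          # first-order prediction
exact = exact_estimate(eps_gal, r2_gal, eps_psf, r2_psf, d_r2_psf, d_eps_psf)

# Special case noted in the text: a perfectly known PSF ellipticity.
m0, c0 = step_parameters_from_psf_errors(r2_gal, r2_psf, eps_psf, d_r2_psf, 0.0)
```

The difference between `exact` and `linear` is second order in the model errors, and `c0` equals `-m0 * eps_psf / P_GAMMA`.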

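When folding this imperfect shear estimator through the calculation of a
correlation function (9), to multiply out some angle brackets we follow
Paulin-Henriksson et al. (2008) in assuming that inaccuracies in the model of
the PSF shape are independent of the shape of the PSF and the size of galaxies
to which it is applied. If that does not hold, the angle brackets cannot be
separated and some cross-terms can be introduced that are computed in
Appendix A. An additional assumption that Paulin-Henriksson et al. (2008) and
we make is that the size of the PSF is roughly constant across the survey, such
that $\langle R_{\rm PSF}^2 R_{\rm PSF}^2\rangle(\ell) \approx \langle R_{\rm PSF}^2\rangle^2$.
In exact correspondence to the various terms of equations (21) and (22), we
find

\begin{align}
\mathcal{A}(\ell, z_A, z_B) &= \frac{1}{P_\gamma^2}\left\langle\frac{R_{\rm PSF}^4}{R_{\rm gal}^4}\right\rangle \left\langle|\delta\varepsilon_{\rm PSF}|^2\right\rangle(\ell, z_A, z_B) \nonumber\\
&\quad + \frac{\langle|\varepsilon_{\rm PSF}|^2\rangle}{P_\gamma^2}\left\langle\frac{R_{\rm PSF}^4}{R_{\rm gal}^4}\right\rangle \frac{\left\langle|\delta(R_{\rm PSF}^2)|^2\right\rangle}{\langle R_{\rm PSF}^2\rangle^2}(\ell, z_A, z_B) \tag{27}\\
\mathcal{M}(\ell, z_A, z_B) &= \left\langle\frac{R_{\rm PSF}^4}{R_{\rm gal}^4}\right\rangle \frac{\left\langle|\delta(R_{\rm PSF}^2)|^2\right\rangle}{\langle R_{\rm PSF}^2\rangle^2}(\ell, z_A, z_B) + 2\left\langle\frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\right\rangle \frac{\left\langle\delta(R_{\rm PSF}^2)\right\rangle}{\langle R_{\rm PSF}^2\rangle}(z_A, z_B). \tag{28}
\end{align}

In combination, this reproduces eqn. (11) of Paulin-Henriksson et al. (2008),
except for the autocorrelation term now intentionally omitted from (27) (see
Section 2.1) and the linear term now appended to (28) (see Section 2.2.2).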

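We shall expand the (scale-dependent) mean squared error terms that reflect a
measurement error, like $\langle|\delta\varepsilon_{\rm PSF}|^2\rangle$, into a
bias $\langle|\delta\varepsilon_{\rm PSF}|\rangle^2$, plus a covariance about
the mean $\sigma^2[|\varepsilon_{\rm PSF}|]$. For the sake of legibility, we do
not likewise expand the mean squared error terms on instrument performance,
such as $\langle|\varepsilon_{\rm PSF}|^2\rangle$, but the split is implicit.
For legibility, we also omit the notation showing functional dependence on
scale and redshift, but note that all bias terms are functions of $(z_A, z_B)$
and all covariances are functions of $(\ell, z_A, z_B)$. Indeed, since
$\langle R_{\rm gal}\rangle$ scales with redshift, every term really will vary
as a function of redshift. To second order in $\delta$, we find

\begin{align}
\mathcal{A} &\approx \frac{1}{P_\gamma^2}\left\langle\frac{R_{\rm PSF}^4}{R_{\rm gal}^4}\right\rangle \sigma^2[|\varepsilon_{\rm PSF}|] \nonumber\\
&\quad + \frac{\langle|\varepsilon_{\rm PSF}|^2\rangle}{P_\gamma^2}\left\langle\frac{R_{\rm PSF}^4}{R_{\rm gal}^4}\right\rangle \left(\frac{\langle\delta(R_{\rm PSF}^2)\rangle^2}{\langle R_{\rm PSF}^2\rangle^2} + \frac{\sigma^2[R_{\rm PSF}^2]}{\langle R_{\rm PSF}^2\rangle^2}\right), \tag{29}
\end{align}

where the spatially constant term in the first line disappears as a delta
function at $\ell = 1$ (or the fundamental mode of the survey) in Fourier
space; a similar cross term involving the (implicit) bias on
$\varepsilon_{\rm PSF}$ is also zero in the second line. Note that all
ellipticities have two components that add in quadrature. Ignoring a bias term
in $\mathcal{M}$ proportional to the square of one already present (therefore
negligible if the bias is small), we also find

$$ \mathcal{M} \approx 2\left\langle\frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\right\rangle \frac{\langle\delta(R_{\rm PSF}^2)\rangle}{\langle R_{\rm PSF}^2\rangle} + \left\langle\frac{R_{\rm PSF}^4}{R_{\rm gal}^4}\right\rangle \frac{\sigma^2[R_{\rm PSF}^2]}{\langle R_{\rm PSF}^2\rangle^2}. \tag{30} $$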



We shall explore concrete values for the terms in equations (29) and (30) in
Section 4.5. For now, notice how the systematics are driven mainly by the size
of the PSF, to the fourth power, which is why cosmic shear measurements are
generally easier from above the Earth's atmosphere. However,
$\delta(R_{\rm PSF}^2)$ terms (proportional to only the second power) arise if
the PSF is wavelength-dependent and measured from stars that are a different
colour to galaxies (Cypriano et al. 2010). This effect is worse for
diffraction-limited space-based observations than ground-based imaging, where
the PSF is determined primarily by atmospheric turbulence. It would likely be
spatially constant (and therefore disappear from $\mathcal{A}$ at least),
except that chromatic aberration may exacerbate it on a characteristic scale
related to the size of a telescope's field of view (Plazas & Bernstein 2012).
It is anyway a function of redshift.



Equation (29) in particular shows that overall performance is driven by the
product of instrument stability and knowledge about that instrument. This
quantifies the trade-offs discussed by Amara, Refregier & Paulin-Henriksson
(2010). To obtain reliable cosmological measurements, we first need
high-quality instrumentation to deliver a system PSF that is

• small ($R_{\rm PSF}$),

• nearly circular (the bias component of $\langle|\varepsilon_{\rm PSF}|^2\rangle$) and

• stable (the variance component of $\langle|\varepsilon_{\rm PSF}|^2\rangle$; we have
already assumed that its size is constant).

It is then equally important to

• understand and accurately model that PSF.

The $\langle\delta\rangle$ terms reflect a calibration bias in the PSF model
(e.g. in its colour), and are likely to be spatially constant. The $\sigma$
terms reflect a lack of knowledge (e.g. from sparse sampling of a
spatially/temporally varying PSF pattern), and are likely to vary as a function
of scale in such a way that they are largest around the mean distance between
stars, the size of the telescope's field of view or (reflecting the intrinsic
variation in the PSF pattern) turbulence cells in the atmosphere.



3.2 Imperfect correction for detector effects 

As well as convolution with a PSF (which in practice can include all optical
and electronic effects that act linearly on pixel values), astronomical images
can also be degraded in more complicated ways. This can include global detector
nonlinearity, in which the number of counts in each pixel is a nonlinear
function of the incident flux, or nonlocal effects such as Charge Transfer
Inefficiency in CCDs (Janesick 2001) and inter-pixel capacitance or persistence
in HgCdTe devices (Barron et al. 2007; McCullough 2008; Seshadri et al. 2008).

These operations cannot be treated mathematically as a convolution, so the
correction procedure outlined in Section 3.1 does not apply. We therefore
introduce a new category of non-convolutive (NC) perturbations in galaxy size
$R_{\rm NC}$ and ellipticity $\varepsilon_{\rm NC}$. The details of these may
depend on the flux and size of the galaxy, but we take a generic approach
(which can hold for small, faint galaxies) in which the observed quantities
become

$$ R_{\rm obs} = \sqrt{R_{\rm gal}^2 + R_{\rm PSF}^2} + R_{\rm NC} \tag{31} $$

and

$$ \varepsilon_{\rm obs} = \varepsilon_{\rm gal} + \frac{R_{\rm PSF}^2}{R_{\rm gal}^2 + R_{\rm PSF}^2}\left(\varepsilon_{\rm PSF} - \varepsilon_{\rm gal}\right) + \varepsilon_{\rm NC}. \tag{32} $$

Note that we have not explicitly included non-convolution effects on stellar
images from which the PSF is modelled. The images of bright stars will also be
degraded, and the budgets for $\delta(R_{\rm PSF}^2)$ and
$\delta\varepsilon_{\rm PSF}$ should allow for this. However, many of the most
serious nonlinear effects operate in the sense that the degradation of bright
sources is much less than that of faint sources (Massey et al. 2010; Hoekstra
et al. 2011). In this case, the perturbations on galaxies $R_{\rm NC}$ and
$\varepsilon_{\rm NC}$ will dominate, in the budget for galaxy shape
measurement rather than the (separable) budget for PSF modelling. A weak
lensing analysis then seeks to recover

\begin{align}
\varepsilon_{\rm gal} &= \left(1 + \frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\right)\left(\varepsilon_{\rm obs} - \varepsilon_{\rm NC}\right) - \frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\,\varepsilon_{\rm PSF} \tag{33}\\
&= \frac{\left(\varepsilon_{\rm obs} - \varepsilon_{\rm NC}\right)\left(R_{\rm obs} - R_{\rm NC}\right)^2 - \varepsilon_{\rm PSF} R_{\rm PSF}^2}{\left(R_{\rm obs} - R_{\rm NC}\right)^2 - R_{\rm PSF}^2}, \tag{34}
\end{align}

where we have taken care on the second line to include only observable
quantities on the right hand side.
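The forward model and its inversion can be sketched in a few lines. This is a minimal illustration, assuming that the non-convolutive perturbation adds linearly to the observed size, $R_{\rm obs} = \sqrt{R_{\rm gal}^2 + R_{\rm PSF}^2} + R_{\rm NC}$, as the $(R_{\rm obs} - R_{\rm NC})^2$ combination in equation (34) implies; the function names and numerical values are hypothetical.

```python
import numpy as np

def observe_with_nc(eps_gal, r2_gal, eps_psf, r2_psf, eps_nc, r_nc):
    """Equations (31) and (32): convolution followed by NC perturbations."""
    r_obs = np.sqrt(r2_gal + r2_psf) + r_nc
    eps_obs = eps_gal + r2_psf / (r2_gal + r2_psf) * (eps_psf - eps_gal) + eps_nc
    return eps_obs, r_obs

def recover_galaxy(eps_obs, r_obs, eps_psf, r2_psf, eps_nc, r_nc):
    """Equation (34): invert using observables plus the PSF and NC models."""
    r2_conv = (r_obs - r_nc) ** 2      # size after removing the NC perturbation
    return ((eps_obs - eps_nc) * r2_conv - eps_psf * r2_psf) / (r2_conv - r2_psf)

# Hypothetical values in arbitrary units.
eps_gal, r2_gal = 0.2 + 0.1j, 1.0
eps_psf, r2_psf = 0.05 - 0.02j, 0.5
eps_nc, r_nc = 0.01 + 0.005j, 0.02

eps_obs, r_obs = observe_with_nc(eps_gal, r2_gal, eps_psf, r2_psf, eps_nc, r_nc)
recovered = recover_galaxy(eps_obs, r_obs, eps_psf, r2_psf, eps_nc, r_nc)
# Ignoring the detector effect altogether leaves a residual in the ellipticity.
uncorrected = recover_galaxy(eps_obs, r_obs, eps_psf, r2_psf, 0.0, 0.0)
```

With perfect knowledge of the NC terms the inversion is exact; setting them to zero in the model leaves a percent-level residual for these example numbers.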

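In practice, any correction scheme will inevitably have inaccuracies
$\delta R_{\rm NC}$ and $\delta\varepsilon_{\rm NC}$, so only an imperfect
estimation is possible of $\varepsilon_{\rm gal}$. Again we expand the shape
observables as a first order Taylor series

\begin{align}
\hat\varepsilon_{\rm gal} \approx \varepsilon_{\rm gal} &+ \frac{\partial\varepsilon_{\rm gal}}{\partial(R_{\rm PSF}^2)}\,\delta(R_{\rm PSF}^2) + \frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm PSF}}\,\delta\varepsilon_{\rm PSF} \nonumber\\
&+ \frac{\partial\varepsilon_{\rm gal}}{\partial R_{\rm NC}}\,\delta(R_{\rm NC}) + \frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm NC}}\,\delta\varepsilon_{\rm NC}, \tag{35}
\end{align}

where the first two partial derivatives remain unchanged as (the second form
of) equations (24) and (25), and

$$ \frac{\partial\varepsilon_{\rm gal}}{\partial R_{\rm NC}} = \frac{2R_{\rm PSF}^2\left(R_{\rm obs} - R_{\rm NC}\right)\left(\varepsilon_{\rm obs} - \varepsilon_{\rm NC} - \varepsilon_{\rm PSF}\right)}{\left[\left(R_{\rm obs} - R_{\rm NC}\right)^2 - R_{\rm PSF}^2\right]^2} = \frac{2R_{\rm PSF}^2\left(\varepsilon_{\rm gal} - \varepsilon_{\rm PSF}\right)}{R_{\rm gal}^2\sqrt{R_{\rm gal}^2 + R_{\rm PSF}^2}} \tag{36} $$

$$ \frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm NC}} = \frac{-\left(R_{\rm obs} - R_{\rm NC}\right)^2}{\left(R_{\rm obs} - R_{\rm NC}\right)^2 - R_{\rm PSF}^2} = -\frac{R_{\rm gal}^2 + R_{\rm PSF}^2}{R_{\rm gal}^2}. \tag{37} $$

Thus

\begin{align}
\hat\varepsilon_{\rm gal} \approx\ & \varepsilon_{\rm gal}\left[1 + \frac{\delta(R_{\rm PSF}^2)}{R_{\rm gal}^2} + \frac{2R_{\rm PSF}^2}{R_{\rm gal}^2}\frac{\delta(R_{\rm NC})}{R_{\rm obs} - R_{\rm NC}}\right] \nonumber\\
& - \left[\frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\,\delta\varepsilon_{\rm PSF} + \frac{R_{\rm gal}^2 + R_{\rm PSF}^2}{R_{\rm gal}^2}\,\delta\varepsilon_{\rm NC} + \left(\frac{\delta(R_{\rm PSF}^2)}{R_{\rm gal}^2} + \frac{2R_{\rm PSF}^2}{R_{\rm gal}^2}\frac{\delta(R_{\rm NC})}{R_{\rm obs} - R_{\rm NC}}\right)\varepsilon_{\rm PSF}\right]. \tag{38}
\end{align}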



This imperfect ellipticity measurement translates into 
additive cosmic shear systematics 



A- 



+ 



P 2 

1 

1 J 



^PSF \ „2r_ , 

j o IjepsFl 



^gal 



2ri I 



epsFp) / -RpsF \ ( {'^(-Rpsf))^ , (^^[-Rpsp] 



P 2 



-^gal / \ (-^PSf) 



r4 

-Ttpgp 



-H4 



(IspsfT) IR%sy \ [ {SR^cf a^[RNc] 



P 2 



-^gal 



p2 

-"-NC 



(39) 



Mixing thus emerges between corrections for convolution and non-convolution
effects. In the second term for example, imperfections
$\delta\varepsilon_{\rm NC}$ in the correction for detector effects are
enhanced during subsequent deconvolution.



The multiplicative cosmic shear systematics are

\begin{align}
\mathcal{M} \approx\ & 2\left\langle\frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\right\rangle\left(\frac{\langle\delta(R_{\rm PSF}^2)\rangle}{\langle R_{\rm PSF}^2\rangle} + \frac{2\langle\delta(R_{\rm NC})\rangle}{\langle R_{\rm obs}\rangle}\right) \nonumber\\
& + \left\langle\frac{R_{\rm PSF}^4}{R_{\rm gal}^4}\right\rangle\left(\frac{\sigma^2[R_{\rm PSF}^2]}{\langle R_{\rm PSF}^2\rangle^2} + \frac{4\,\sigma^2[R_{\rm NC}]}{\langle R_{\rm obs}\rangle^2}\right) \tag{40}
\end{align}

plus bias terms proportional to the square of those already present (therefore
negligible if the bias is small). The second term
$\langle\delta R_{\rm NC}\rangle$ reflects overall uncertainty in the model of
non-convolution effects, such as the density and characteristic release time of
charge traps in CCDs. These quantities may be stable over long periods of time,
but the error may vary as a function of object flux (hence redshift) if, in
this case, the CCD well-filling model is inaccurate. The fourth term
$\sigma^2[R_{\rm NC}]$ reflects unaccounted variation of an effect at different
positions within a detector. Depending on survey tiling strategies, NC terms
are likely to be largest on physical scales corresponding to linear multiples
of the chip size (see Cropper et al. 2012).



3.3 Imperfect shape measurement methods 

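In the previous sections we examined the impact of errors in the measurements
of the PSF and detector effects, but we implicitly assumed that the observed
galaxy moments are unbiased. In practice, the unweighted size $R_{\rm obs}$ and
shape $\varepsilon_{\rm obs}$ of a faint galaxy may be subject to errors
$\delta R_{\rm obs}$, $\delta\varepsilon_{\rm obs}$ for a whole variety of
reasons including mis-centering, background gradients/structure, pixellisation,
and simply noise. We therefore need to consider also the impact of
imperfections in the measurements of the galaxies. This leads to new
contributions to the observed ellipticity

\begin{align}
\hat\gamma \approx \frac{\varepsilon_{\rm gal}}{P_\gamma} &+ \frac{1}{P_\gamma}\frac{\partial\varepsilon_{\rm gal}}{\partial(R_{\rm PSF}^2)}\,\delta(R_{\rm PSF}^2) + \frac{1}{P_\gamma}\frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm PSF}}\,\delta\varepsilon_{\rm PSF} \nonumber\\
&+ \frac{1}{P_\gamma}\frac{\partial\varepsilon_{\rm gal}}{\partial(R_{\rm NC})}\,\delta(R_{\rm NC}) + \frac{1}{P_\gamma}\frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm NC}}\,\delta\varepsilon_{\rm NC} \nonumber\\
&+ \frac{1}{P_\gamma}\frac{\partial\varepsilon_{\rm gal}}{\partial(R_{\rm obs}^2)}\,\delta(R_{\rm obs}^2) + \frac{1}{P_\gamma}\frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm obs}}\,\delta\varepsilon_{\rm obs}. \tag{41, 42}
\end{align}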



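The new partial derivatives of (34) are

$$ \frac{\partial\varepsilon_{\rm gal}}{\partial(R_{\rm obs}^2)} = -\frac{R_{\rm PSF}^2\left(\varepsilon_{\rm gal} - \varepsilon_{\rm PSF}\right)}{R_{\rm gal}^2\,R_{\rm obs}\left(R_{\rm obs} - R_{\rm NC}\right)} \tag{43} $$

$$ \frac{\partial\varepsilon_{\rm gal}}{\partial\varepsilon_{\rm obs}} = \frac{R_{\rm gal}^2 + R_{\rm PSF}^2}{R_{\rm gal}^2}. \tag{44} $$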



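Alternatively, note that
$\partial\varepsilon_{\rm gal}/\partial R_{\rm obs} = -\partial\varepsilon_{\rm gal}/\partial R_{\rm NC}$.
Including observational error, we thus find the shear measurement
$\hat\gamma$ has biases given by STEP parameters (eqn. 11)

\begin{align}
c \approx \frac{1}{P_\gamma}\frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\Bigg[ & \frac{R_{\rm gal}^2 + R_{\rm PSF}^2}{R_{\rm PSF}^2}\left(\delta\varepsilon_{\rm obs} - \delta\varepsilon_{\rm NC}\right) - \delta\varepsilon_{\rm PSF} \nonumber\\
& - \left(\frac{\delta(R_{\rm PSF}^2)}{R_{\rm PSF}^2} + \frac{2\,\delta R_{\rm NC}}{R_{\rm obs} - R_{\rm NC}} - \frac{\delta(R_{\rm obs}^2)}{R_{\rm obs}\left(R_{\rm obs} - R_{\rm NC}\right)}\right)\varepsilon_{\rm PSF}\Bigg] \tag{45}
\end{align}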

© 2011 RAS, MNRAS 000, 1–19

Origins & requirements for weak lensing systematics



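$$ m \approx \frac{R_{\rm PSF}^2}{R_{\rm gal}^2}\left(\frac{\delta(R_{\rm PSF}^2)}{R_{\rm PSF}^2} + \frac{2\,\delta R_{\rm NC}}{R_{\rm obs} - R_{\rm NC}} - \frac{\delta(R_{\rm obs}^2)}{R_{\rm obs}\left(R_{\rm obs} - R_{\rm NC}\right)}\right). \tag{46} $$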


We have so far considered only shape measurement us- 
ing unweighted moments. This approach greatly simplifies 
the calculations, but potentially limits the applicability to 
real data. This is because the presence of any noise in an 
image formally leads to infinite noise in the measurements 
of unweighted moments. It may be feasible to measure (close 
to) unweighted moments in the special cases of very bright 
stars, or of repeated detector effects, by stacking data to 
suppress the noise. 
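The divergence of noise in unweighted moments can be made concrete with a small calculation. For uncorrelated pixel noise of standard deviation `sigma_pix`, the noise contribution to the sum $\sum_i w_i I_i x_i^2$ has variance $\sigma_{\rm pix}^2 \sum_i w_i^2 x_i^4$, which grows without bound as the postage stamp is enlarged unless a weight function suppresses the outer pixels. The function name, stamp sizes and Gaussian weight width below are hypothetical choices for illustration.

```python
import numpy as np

def quadrupole_noise_variance(half_width, sigma_pix=1.0, weight_sigma=None):
    """Variance of sum_i w_i n_i x_i^2 for uncorrelated pixel noise n_i."""
    coords = np.arange(-half_width, half_width + 1, dtype=float)
    x, y = np.meshgrid(coords, coords)
    if weight_sigma is None:
        w = np.ones_like(x)                                   # unweighted moments
    else:
        w = np.exp(-(x**2 + y**2) / (2 * weight_sigma**2))     # Gaussian weight
    return sigma_pix**2 * np.sum((w * x**2) ** 2)

small, large = 16, 32   # postage-stamp half-widths in pixels (illustrative)
unweighted_growth = quadrupole_noise_variance(large) / quadrupole_noise_variance(small)
weighted_growth = (quadrupole_noise_variance(large, weight_sigma=3.0)
                   / quadrupole_noise_variance(small, weight_sigma=3.0))
```

Doubling the stamp size increases the unweighted noise variance by a factor of more than fifty, whereas the Gaussian-weighted variance is unchanged to numerical precision.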

It is never possible in practice to measure directly the unweighted moments of
faint galaxies, and one has to use weighted moments instead. The optimal weight
function to use is the one that maximizes the signal-to-noise ratio, which in
turn implies that the weight function closely resembles the galaxy profile.
This is naturally done by methods that fit parametric shape models to the data
(e.g. Bridle et al. 2001; Miller et al. 2007; Kitching et al. 2008). Moment
based methods (e.g. Kaiser, Squires & Broadhurst 1995, hereafter KSB; Rhodes,
Refregier & Groth 2000, hereafter RRG) instead construct sizes
$R_{\rm gal\,w}$ and ellipticities $\varepsilon_{\rm gal\,w}$ from quadrupole
moments weighted by a radial Gaussian function, the size of which is matched to
the object. There are no simple expressions that relate $R_{\rm gal}$ and
$\varepsilon_{\rm gal}$ in terms of observed weighted moments, equivalent to
the unweighted versions (31) and (32). Derivations using weighted moments are
complicated and involve mixing of higher order moments (KSB; RRG; Kaiser 2000;
Refregier 2003; Melchior et al. 2011). For any individual galaxy however, it is
possible^4 to define without loss of generality various $P'$ quantities to form
a shear estimator from weighted moments



1 



p/ d2 

-"-galw 



^obs w 



SNC\ 



1 ^PSF 



p p/ p2 

JTj JTf^ ^galw 



V P' 



(47) 



where we have intentionally arranged terms to resemble equation (33). For
individual galaxies, especially those with complex intrinsic shapes, it can be
that $\tilde\gamma_{\rm w} \neq \tilde\gamma$, as long as averaged over a large
population of galaxies, $\langle\tilde\gamma_{\rm w}\rangle \approx \langle\tilde\gamma\rangle$.
The $P'$ quantities fulfil two roles, and can even be expressed as the product
of discrete quantities

$$ P'_x = W_x\,P_x. \tag{48} $$

^4 For example, the shear estimator in KSB (in the absence of non-convolutive
effects) is

$$ \tilde\gamma_{\rm w} = \left(P^\gamma_{\rm KSB}\right)^{-1}\varepsilon_{\rm obs\,w}, \quad {\rm where} \quad P^\gamma_{\rm KSB} \equiv P^{\rm sh} - P^{\rm sm}\left(P^{\rm sm}_{\rm PSF}\right)^{-1}P^{\rm sh}_{\rm PSF}. $$

The interpretation of such quantities is method-specific. If
$\tilde\gamma_{\rm w} \approx (P_\gamma)^{-1}(P'_\varepsilon)^{-1}\varepsilon_{\rm obs\,w}$,
the middle factor can be interpreted as part of either the polarizability (e.g.
Kaiser, Squires & Broadhurst 1995; Massey et al. 2007c use higher order moments
to construct $(P_\gamma P'_\varepsilon)^{-1}$), or as part of the ellipticity
(e.g. Rhodes, Refregier & Groth 2000; Kaiser 2000 use higher order moments to
convert weighted ellipticities to unweighted ellipticities
$(P'_\varepsilon)^{-1}\varepsilon_{\rm gal\,w}$).



Both components of $P'_x$ are tensors, but they are nearly diagonal, so for
simplicity we shall treat them as scalars. The first component $W_x$
compensates for the weight function's changes to moments, e.g.



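\begin{align}
\frac{R_{\rm PSF\,w}^2}{R_{\rm gal\,w}^2} &= W_R\,\frac{R_{\rm PSF}^2}{R_{\rm gal}^2} \tag{49}\\
\varepsilon_{\rm PSF\,w} &= W_\varepsilon\,\varepsilon_{\rm PSF} \tag{50}
\end{align}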



etc. Numerical values of this component depend upon the shape measurement method, but for small galaxies $W_R \approx 1$ as it governs a ratio of similar quantities and $W_\varepsilon \approx 1/2$ (for all the ellipticities). The second component $P_x$ encodes the way in which the effective PSF is altered by the weight function, and its numerical values depend upon the PSF properties. For a Gaussian PSF, all $P$ values are exactly equal to 1. This approximately holds for a smooth (e.g. ground-based) PSF or a small PSF (or a large galaxy). For an Airy PSF, the outer diffraction wings are damped by the weight function, leading to large differences between weighted and unweighted quantities. For large galaxies, the weight function will be extended and the suppression is small. For small galaxies, size estimates are most affected, and we find $P_R \approx 2$: the net effect of the weight function is equivalent to reducing the PSF size. Ellipticities are less affected, with $P_\varepsilon \approx 1$ in any observing regime. This depends weakly on the intrinsic ellipticity and size but, since we shall generally consider limiting cases of small/faint galaxies, we shall henceforth treat these factors as constants.

We now re-evaluate the additive and multiplicative biases, accounting for the use of weighted moments. This could involve replacing all mentions of observable sizes and shapes by their weighted equivalents. However, for comparison with our earlier results, and to eventually express engineering requirements on instrumentation, it is more convenient to continue to use unweighted quantities. Substituting equations (49) and (50) into (47), we find



7w 



1 



1 + 



Pr R^ 



gal 



^obs 



i-PSF 
?2 



epsF 



P-y Pr Rg^i V^epSF 



(51) 



^ Consider the pathological example of a PSF consisting of a core plus a ring at large radius. The ring lowers the perceived flux of a galaxy, but has no effect on its size or shape as determined from weighted moments.

This expression clearly demonstrates how weighted moments can naturally suppress bias. However, this advantage comes at a price. The evaluation of the $P$ factors requires knowledge of higher order shape moments, which can be well known for bright stars but are especially noisy for faint galaxies. The absolute values of $P_{\varepsilon_{\rm PSF}}$, $P_{\varepsilon_{\rm NC}}$ and $P_{\varepsilon_{\rm obs}}$ adjust the balance between different contributions to the bias, but errors in those quantities are functionally identical to errors in $\varepsilon_{\rm PSF}$, $\varepsilon_{\rm NC}$ and $\varepsilon_{\rm obs}$, which we have already considered. Observational errors in $P_R$ propagate into a new source of bias, via
57w 



depsF 



9(7?Nc) osnc 



(52) 



where the derivatives of $\gamma_w$ gain prefactors of $1/P_R$ or $1/P_R P_\varepsilon$ compared to those of $\gamma$ and



r2 



gal ~ -"-PSF 



7w+p5^). (53) 



This general case thus has additive cosmic shear systematics 
1 / 7?|.sf\ a2[|epsF|] 



p 2p 2 



p 2 



Pr + 

"gal 



+ 



+ 



where 



p2 p2 p 2 ' 

fl^ 7 EPSF 


\^^al 


4(lepsFl') 


/ -RpSF 


p2 p2 p 2 
ii-' 7 EPSF 


\^^al 


(|epsFp) / 


' ^PSF 


p2 p2 p 2 ' 
ij-' 7 ^PSF 


\ -Rgal 



• [|£nc| 
p 2 



V (Ppsp) 



\ (Pnc) 



+ 



R2 



(57) 



If the size of the PSF depends upon wavelength, this term introduces a sensitivity to spatial variations in the colour of a galaxy (whereby the PSF is different in the bulge and the disc: Voigt et al. 2012; Semboloni et al. in prep.).

This is because multiple galaxy profiles result in galaxies with identical observed moments, so the estimate for $P_R$ becomes biased. Similar biases in $P_R$ also arise in parametric fitting methods if the model does not reflect galaxies' true morphological characteristics (Voigt & Bridle 2010), suffers from aliasing (Bernstein 2010), or is nonlinear (Refregier et al. 2012). In this paper we do not distinguish between these individual origins, but consider all such effects part of a general method bias.

We conclude that a shear estimator $\gamma_w$ constructed from weighted moments has STEP biases

1 RpsF J [ PrPIri + RpsF \ / SsohB _ ^£nc \ 



P-iPr -Rgal 



r2 



(5epsF EpsF ( 5(-Rpsf) 



-rtpgp 



-I- 



+ 



-Pfl-Rgal 



2 5Rnc 
RohB — Rnc 

5Pb. 



Rohs{Robs — Rnc) PpSFiPR-Pgal ~^ PpSp) PR 



(54) 



2 _ {s{RLB)y , 



(PL 



-ttgal 
-rtpgp 



PrRIh^i 



PrRI 



(PI) 
(58) 

The first term could arise due to pixellisation effects, but this will be zero for resolved imaging and deviations could be measured only by changing the plate scale in a camera. Note that if autocorrelations of galaxy shapes with themselves are included in the correlation function analysis (see Section 2.1), the additive cosmic shear systematics gain an extra white noise term $\sigma^2(z_A, z_B)$ as in equation (18).

The multiplicative cosmic shear systematics become

f {S(R 

\ (PpSF 



M = 



2 



RpSF 

RL. 



^psf)) 



p 2 



pu 



(Rpsp) 



4-4- 



(-Robs) 

tHRnc] 

(PL.) 



(59) 



4 REQUIREMENTS TO MEET FUTURE 
SCIENCE GOALS 

4.1 How much systematic bias is tolerable? 



^psf 



-^gal 



fSiR 



psf) 



V p 
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PSF 
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2 5i?NC 
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where 



'5(^obs) 



Pobs(-Robs — Pnc) 
SP^ 



-Pr 



^gal 
R2 

-rtpgp 



+ 



R„ 



PrPIsi + -RpsF 



SPr 
Pr 



(55) 



(56) 



is the component of bias due to the galaxy shape measurement method. The STEP parameter $q$, which flags an (incorrect) quadratic response to shear, could be produced by measurement errors that depend on intrinsic ellipticity such as $\delta\varepsilon_{\rm obs}(\varepsilon_{\rm gal})$. Averaged over a galaxy population, these are functionally identical to errors $\delta P_\gamma$.

Observational errors are likely isotropic i.e. $\langle\delta\varepsilon_{\rm obs}\rangle = 0$ and spatially constant i.e. in the absence of galaxy-galaxy autocorrelations $\sigma^2[R^2_{\rm obs}] = \sigma^2[|\varepsilon_{\rm obs}|] = \sigma^2[P_\gamma] = \sigma^2[P_R] = 0$.



Total experimental error from any measurement is always a combination of systematic and statistical errors. Systematic bias (i.e. the deviation of a measured value from the truth) can be reduced by e.g. stabilising a telescope or raising it above the Earth's atmosphere. Statistical error (i.e. the confidence interval allocated to a measured value) is limited by the finite number of measurements within a survey, and can be reduced by e.g. increasing survey volume. The diagrams inset within Figure 1 illustrate how an unrecognised systematic bias shifts measurements, which are drawn from a statistical likelihood distribution around the offset value.

Classical astronomical survey design optimises an observation that is limited by systematic biases inherent to a technique or its interpretation. This limitation drives surveys wider, deeper or to higher resolution, until their statistical error becomes smaller than the systematic bias. However, several surveys planned for the next decade have scientific goals that require them to image the entire sky outside the plane of the Milky Way. Further increasing survey area is impossible, and increasing survey depth can be prohibitively
Figure 1. The effect of a systematic bias in an experiment, as a function of statistical error $\sigma$, assuming all likelihood distributions are Gaussian. The y axis is the chance that a reported experimental result is drawn from the reported likelihood distribution, centered around the true value. The two lower (black) curves show this chance in the presence of some exact systematic bias $b$, for the reported statistical errors (lower) or total errors (upper). The top (blue) curve shows this chance if $b$ is instead a 95% confidence limit on the (unknown) true bias. In this case, the true bias could be zero, so the overlap with the ideal likelihood distribution is always greater. Using this latter definition, we require $b < 0.31\sigma$ for a 95% overlap with the ideal PDF.



expensive: especially for space-based surveys, where mission cost jumps in step functions with mirror size (bigger launch vehicles become necessary) or survey duration (additional redundancy of components). For these surveys, the statistical error is fixed and the classical trade-off is inverted; the key question becomes how much systematic bias is tolerable? We shall answer this quantitatively by considering the probability with which an experiment's reported measurement of a particular parameter could have been obtained by an unbiased experiment. This is the overlap integral between the likelihood distribution reported by an experiment, and the likelihood distribution that would have been reported by an unbiased experiment (i.e. the same distribution, re-centered around the parameter's true value).

^ This is a frequentist argument based on p-value like statistics; a Bayesian methodology, using evidence ratios for a target model with and without systematics, could also be considered. We shall also only consider the 1 dimensional bias on a single parameter at a time (c.f. Dodelson, Shapiro & White 2006; Shapiro 2009; Shapiro et al. 2010).

Throughout this section, we shall consider statistical errors described by a Gaussian of width $\sigma$. There are two ways in which a systematic bias can be described. Following frequent use in the literature, and as illustrated in the upper inset panel of Figure 1, we shall first consider an experiment with an exact amount of bias, $b$. The lowest curve in Figure 1 shows the probability that a reported measurement could have been sampled from the unbiased (re-centered) likelihood. This is simply the (cross-hatched) overlap integral under two Gaussians with width $\sigma_1 = \sigma_2 = \sigma$ and mean $\mu_1 = 0$, $\mu_2 = b$.




For the overlap to be at least 95% (90%), the absolute value of bias $|b|$ must be less than $0.13\sigma$ ($0.25\sigma$). If bias is allowed to be as large as the $1\sigma$ statistical error, the overlap integral is only 62%, which is undesirable. One effect slightly improves this: as illustrated in the lower inset diagram, reported error bars will be enlarged to account for an estimate of the systematic bias. The middle curve in Figure 1 shows what happens if the achieved level of bias were treated as a 95% confidence limit on a Gaussian systematic error budget, i.e. $\sigma_t = b/2$. In this case, the overlap integral becomes



total /7 \ 

Povcrlap V J 



da; 



(63) 



although this does not significantly affect p. 
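The exact-bias thresholds quoted above can be checked with a short calculation. The sketch below assumes that the overlap of two equal-width Gaussians separated by $b$ is $\mathrm{erfc}(|b|/2\sqrt{2}\sigma)$, which is a standard result but is not written out in the text; the function names are hypothetical.

```python
import math


def overlap_exact_bias(b, sigma=1.0):
    """Overlap area of N(0, sigma) and N(b, sigma).

    Assumed closed form: erfc(|b| / (2 * sqrt(2) * sigma)).
    """
    return math.erfc(abs(b) / (2.0 * math.sqrt(2.0) * sigma))


def max_bias_for_overlap(target, sigma=1.0):
    """Largest |b| whose overlap is still >= target, found by bisection."""
    lo, hi = 0.0, 10.0 * sigma
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if overlap_exact_bias(mid, sigma) >= target:
            lo = mid
        else:
            hi = mid
    return lo


print(round(overlap_exact_bias(1.0), 2))    # overlap when b equals the 1-sigma error
print(round(max_bias_for_overlap(0.95), 2))  # bias allowed for a 95% overlap
print(round(max_bias_for_overlap(0.90), 2))  # bias allowed for a 90% overlap
```

Under this assumption the three printed numbers correspond to the 62%, $0.13\sigma$ and $0.25\sigma$ values in the text.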

However, any systematic bias that is known exactly would already have been subtracted from a measurement! We shall now re-interpret $b$ as the 95% confidence limit on the absolute value of an unknown bias. A Gaussian distribution of possible biases with mean zero and width $\sigma_t = b/2$ sometimes creates small or even zero bias, so the overlap of reported and ideal measurements is greater. Marginalising over this distribution, the top curve in Figure 1 shows



marginalised /t \ 

Poverlap \") — 



total / 7 / \ M f 
Povcrlap (b)db. 



(64) 



Achieving a 95% (90%) probability that a reported result could have been drawn from the likelihood distribution re-centered on the true value now requires $|b| < 0.31\sigma$ ($0.62\sigma$). Only 69% overlap arises if the systematic and statistical error budgets are equal ($\sigma_t = \sigma$). We shall henceforth require uncertain biases to have a 95% confidence limit that is less than 31% of the $1\sigma$ expected statistical error.
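The marginalisation can be illustrated numerically. The sketch below is a simplification, not the calculation behind Figure 1: it averages the exact-bias overlap (assumed to be $\mathrm{erfc}(|b|/2\sqrt{2}\sigma)$) over a zero-mean Gaussian of width $\sigma_t = b/2$, and ignores the enlarged error bars of equation (63). It should therefore recover the $0.31\sigma$ threshold only approximately, and need not match the other quoted values exactly. Function names and the integration grid are hypothetical.

```python
import math


def overlap_exact_bias(b, sigma=1.0):
    """Overlap of N(0, sigma) and N(b, sigma), assumed closed form."""
    return math.erfc(abs(b) / (2.0 * math.sqrt(2.0) * sigma))


def overlap_marginalised(b95, sigma=1.0, n_steps=4000):
    """Average the exact-bias overlap over possible biases.

    b95 is read as a 95% confidence limit, so the biases are drawn from
    a zero-mean Gaussian of width sigma_t = b95 / 2. The integral uses
    the trapezoidal rule over +/- 8 sigma_t.
    """
    sigma_t = b95 / 2.0
    if sigma_t == 0.0:
        return 1.0
    half_range = 8.0 * sigma_t
    step = 2.0 * half_range / n_steps
    norm = sigma_t * math.sqrt(2.0 * math.pi)
    total = 0.0
    for i in range(n_steps + 1):
        x = -half_range + i * step
        pdf = math.exp(-0.5 * (x / sigma_t) ** 2) / norm
        weight = 0.5 if i in (0, n_steps) else 1.0
        total += weight * overlap_exact_bias(x, sigma) * pdf * step
    return total


print(round(overlap_marginalised(0.31), 3))
```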



4.2 Propagation of shear measurement errors to 
biases on cosmological parameters 

We now propagate hypothetical shear measurement errors $\mathcal{A}(\ell, z_A, z_B)$ and $\mathcal{M}(\ell, z_A, z_B)$ from Section 3 into biases on derived cosmological parameters, via the Fisher matrix bias formalism (Taylor & Watts 2001, see also Amara & Refregier 2007; Kitching et al. 2009b). In particular, we concentrate on measurements of the dark energy equation of state parameter $w$ or its derivative $w_a$ (Chevallier & Polarski 2001; Linder 2003), and marginalise over other parameters. By requiring that the 95% confidence limit on bias is less than 31% of the statistical errors afforded by Poisson noise in a finite survey volume (see Section 4.1), we obtain requirements on the accuracy with which the PSF must be modelled, detector effects must be corrected, and galaxy shapes must be measured. This is more stringent than the work of Amara & Refregier (2008), who required bias less than 100% of statistical error.
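For orientation, the Fisher matrix bias formalism can be sketched in its commonly used form, $b_i = \sum_j (F^{-1})_{ij} B_j$ with $B_j = \sum_k \delta C_k\,(\partial C_k/\partial\theta_j)/\sigma_k^2$. The example below is a toy with two parameters and a diagonal covariance; both are assumptions made for brevity, the input numbers are invented, and nothing here reproduces the iCosmo calculation.

```python
def fisher_bias(derivs, variances, delta_c):
    """Parameter biases caused by a systematic offset in the data.

    derivs[i][k] : dC_k / d(theta_i) for parameter i (two parameters) and bin k
    variances[k] : variance of C_k (diagonal covariance assumed)
    delta_c[k]   : systematic offset added to C_k
    Returns [b_0, b_1] = F^{-1} B.
    """
    n = len(variances)
    # Fisher matrix F_ij = sum_k dC_k/dtheta_i * dC_k/dtheta_j / var_k
    f = [[sum(derivs[i][k] * derivs[j][k] / variances[k] for k in range(n))
          for j in range(2)] for i in range(2)]
    # Bias vector B_i = sum_k delta_C_k * dC_k/dtheta_i / var_k
    bvec = [sum(delta_c[k] * derivs[i][k] / variances[k] for k in range(n))
            for i in range(2)]
    det = f[0][0] * f[1][1] - f[0][1] * f[1][0]
    inv = [[f[1][1] / det, -f[0][1] / det],
           [-f[1][0] / det, f[0][0] / det]]
    return [inv[i][0] * bvec[0] + inv[i][1] * bvec[1] for i in range(2)]


# Invented toy inputs: four data bins, two parameters
toy_derivs = [[1.0, 2.0, 3.0, 4.0], [1.0, 0.5, 0.25, 0.125]]
toy_variances = [0.1, 0.2, 0.3, 0.4]
# A systematic shaped exactly like the first derivative biases only theta_0
toy_offset = [0.01 * d for d in toy_derivs[0]]
print(fisher_bias(toy_derivs, toy_variances, toy_offset))
```

The toy makes one property explicit: a systematic whose shape matches $\partial C/\partial\theta_0$ is fully absorbed into $\theta_0$, which is why the shape of a systematic matters as much as its amplitude.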

We assume a baseline 15,000 square degree cosmic shear survey resolving 30 galaxies per square arcminute with median redshift of 1.0 and split into 10 tomographic redshift bins. This matches the configuration of the proposed Euclid mission (Laureijs et al. 2011), and is likely to be similar to any proposed Stage IV survey: for example, LSST proposes to survey 18,000 square degrees with an effective density of 40 galaxies per square arcminute (Jee & Tyson 2011; Bradshaw et al. 2012). Das et al. (2012) describe the effect of perturbing the parameters of the baseline survey in a similar analysis.



We use the iCosmo Fisher matrix software (Refregier et al. 2011; Kitching et al. 2009b) to calculate the concordance $\Lambda$CDM cosmic shear power spectrum $C(\ell, z_A, z_B)$ in a top-hat basis set (200 bins) spanning scales $10 < \ell < 5000$ and every pair of redshift bins. Henceforth, $\ell$, $z_A$ and $z_B$ refer to the median values of the population of galaxies within these bins. We assume the Limber approximation, and neglect any power spectrum due to intrinsic alignments. Using only weak lensing measurements, such an experiment can measure $w$ with a $1\sigma$, 1-parameter statistical error of 0.065, and $w_a$ with a statistical error of 0.41.



4.3 Constant additive and multiplicative shear 
measurement bias 

To explore the consequences of the simplest possible systematic errors, we first impose upon each measurement of $C(\ell)$ a constant additive shear measurement bias $\mathcal{A}$ (or $\sigma_c^2$) and a constant multiplicative shear measurement bias $\mathcal{M}$ (or $m$). This simultaneity of multiplicative and additive biases has not been explored before, with previous studies in the literature considering the imposition of only one type of systematic at a time. Note that although $\sigma_c^2$ is positive by definition, and $m$ is almost always negative in practice (e.g. Bridle et al. 2010), we explore positive and negative values in both cases because if their values are known, they would be removed from data (or added to models). The only important parameter is the residual after this process, i.e. the accuracy to which $\mathcal{A}$ and $\mathcal{M}$ are known. By definition, this residual is equally likely to be either positive or negative.

We find that there is a degeneracy between the two types of bias, in terms of the way they influence constraints on the dark energy equation of state parameter $w$ (Figure 2, top panel). Indeed, if $\mathcal{A}$ and $\mathcal{M}$ have the same sign, they can cancel each other out to produce no net bias on $w$. However, the tuning of this cancellation is specific to the parameter being measured: the degeneracy is completely different for measurements of $w_a$, or $\sigma_8$.

Given our first order expansion, it is not surprising that the surface of Figure 2 is approximately fitted by a plane $b/\sigma \approx -0.093 - 3.9m + 3.3\times10^{10}\sigma_c^2$. Thus, if the signs of $\mathcal{A}$ and $\mathcal{M}$ are not known a priori, guaranteeing $|b| < 0.31\sigma$ requires

$|m| + 8.1\times10^{9}\,\sigma_c^2 < 0.10$. (65)

Whilst surprisingly large constant $m$ can be acceptable for measurements of $w$ (since $\mathcal{M}C(\ell)$ then does not resemble $\partial C(\ell)/\partial w$), we again note that this is not true for measurements of other cosmological parameters. Most importantly, we note the necessity for joint requirements on $\mathcal{A}$ and $\mathcal{M}$. Whenever requirements are placed on $\mathcal{A}$ when assuming $\mathcal{M} = 0$ or vice versa, one degenerate error budget is being spent twice. The two requirements should be halved and, since the bias surface is well-fit by a plane, the two requirements can be linearly traded against each other. This degeneracy has not been taken into account by earlier work.



4.4 Simple forms of additive and multiplicative 
shear measurement bias 

As discussed in Section 3, systematics often affect some physical scales more than others, and it is typically more difficult to measure the shapes of distant (small, faint) galaxies than nearby (big, bright) ones. One feasibly more realistic
Figure 2. The (absolute value of) bias on measurements of the dark energy equation of state parameters $w$ (colour and solid contours) and $w_a$ (dotted contours) from weak lensing surveys with multiplicative and additive shear measurement systematics. Bias is shown as a multiple of the expected statistical error $\sigma$, and contours are drawn at the same values as in Figure 1. Top panel: constant systematics $m$ and $\sigma_c^2$. Bottom panel: one realisation of variable systematics $\mathcal{M}(z)$ and $\mathcal{A}(\ell)$, as described in the text (note the change of scale).



functional form for non-constant additive systematics is 



Ail) = ao[l + 
where £i 



lo 

1000, /3i 



/3l 



1.5, /32 



(66) 



-3 (eqn. 29 of Amara, Refregier & Paulin-Henriksson 2010). A feasible functional form for multiplicative systematics is

$\mathcal{M}(z_A, z_B) = m(z_A) + m(z_B) + m(z_A)\,m(z_B)$ (67)

where



m{z) = mo — (1 + z) 



/3„ 



tan 



(68) 



with Q,„ = 10, Pm = 1.5 and a transition in sign at zt ~ 1 
(eqn. 20 of Amara & Refregier 2008).

The bias surface for this parameterisation (Figure[2]bot- 
tom panel) is well-fit by a plane b/a ^ 0.031-1- llOmo — 7.5 x 
lO^'^ao. This means that, while the error budget 

|mo| -f6.7 X lO^laol ^ 2.8 X 10"^ (69) 



must again be split between additive and multiplicative systematics, the allocations can still be traded linearly against each other. Note that absolute requirements on parametric variables $a_0$ and $m_0$ are tighter than those on $\sigma_c^2$ and $m$ partly because the unnormalised functions are much lower than unity, and partly because $\mathcal{A}$ and $\mathcal{M}C$ are now more similar to $\partial C/\partial w$.



4.5 General forms of additive and multiplicative 
shear measurement bias 

Since the real scale-dependence of systematics will remain unknown for any survey (even after its completion), we now use a Monte Carlo approach (c.f. Kitching et al. 2009a) to explore all possible functional forms of $\mathcal{A}$ and $\mathcal{M}$. We explore this very high dimensional parameter space separately for each type of bias, but remember the caveat about duplicated error budgets and the necessity/ability to trade between requirements on each. In general, requirements will emerge upon the functional forms of $\mathcal{A}$ and $\mathcal{M}$. For tractability, we collapse each function to a single number



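$\mathcal{A} \equiv \dfrac{\sum_{z\,{\rm bins}} \frac{1}{2\pi}\int |\mathcal{A}(\ell, z_A, z_B)|\,\ell^2\,{\rm d}\ln\ell}{\sum_{z\,{\rm bins}} \frac{1}{2\pi}\int \ell^2\,{\rm d}\ln\ell}$ (70)

$\mathcal{M} \equiv \dfrac{\sum_{z\,{\rm bins}} \frac{1}{2\pi}\int |\mathcal{M}(\ell, z_A, z_B)|\,\ell^2\,{\rm d}\ln\ell}{\sum_{z\,{\rm bins}} \frac{1}{2\pi}\int \ell^2\,{\rm d}\ln\ell}$ (71)

Thus we generalise $\sigma^2_{\rm sys}$ in Amara & Refregier (2008) to 3D correlation functions, and include a renormalisation, by way of the denominator, that reduces sensitivity to changes in the adopted $\ell$-range. Values of these performance indicators are shown on the right and upper axes of Figure 2. For our baseline survey, the denominator in (70) and (71) is $55 \times 2.0\times10^{6}$. For the shorter $\ell$-range used by GREAT10,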
the denominator is 1.8 x 10"". Other possible choices for the 
weighting inside the integral, and the slightly different approach required for practical calculations in GREAT10, are discussed in Appendix B.
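The role of the normalising denominator can be illustrated with a small calculation. The sketch below assumes that equations (70) and (71) are $\ell^2$-weighted averages over ${\rm d}\ln\ell$, summed over redshift bin pairs, with a common factor $1/2\pi$; it also assumes logarithmically spaced top-hat bins. Both are readings of the text rather than confirmed details, and all function names are hypothetical.

```python
import math


def log_spaced_edges(ell_min, ell_max, n_bins):
    """Bin edges equally spaced in ln(ell) (an assumed binning)."""
    ratio = (ell_max / ell_min) ** (1.0 / n_bins)
    return [ell_min * ratio ** k for k in range(n_bins + 1)]


def weighted_integral(values, edges):
    """(1/2pi) * sum over bins of |value| * integral of ell^2 dln(ell).

    Within a top-hat bin the integral of ell^2 dln(ell) is (hi^2 - lo^2)/2.
    """
    total = 0.0
    for v, lo, hi in zip(values, edges[:-1], edges[1:]):
        total += abs(v) * 0.5 * (hi * hi - lo * lo)
    return total / (2.0 * math.pi)


def performance_indicator(sys_by_pair, edges):
    """Collapse {(zA, zB): [value per ell bin]} to a single number."""
    ones = [1.0] * (len(edges) - 1)
    numerator = sum(weighted_integral(v, edges) for v in sys_by_pair.values())
    denominator = len(sys_by_pair) * weighted_integral(ones, edges)
    return numerator / denominator


edges = log_spaced_edges(10.0, 5000.0, 200)
pairs = [(a, b) for a in range(10) for b in range(a, 10)]  # unique bin pairs
constant = {pair: [1.8e-12] * 200 for pair in pairs}
per_pair_norm = weighted_integral([1.0] * 200, edges)
indicator = performance_indicator(constant, edges)
print(len(pairs), per_pair_norm, indicator)
```

With this normalisation a scale-independent systematic maps onto itself, so the indicator can be compared directly with a constant bias, and the per-pair normalisation for $10 < \ell < 5000$ evaluates to roughly $2.0\times10^{6}$ under these assumptions.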

To span the space of possible systematics functions, we generate 100,000 random realisations of $\mathcal{A}(\ell, z_A, z_B)$; for now, we set $\mathcal{M} = 0$. Assuming conservatively that generic systematics contribute equally to all scales and redshift bins, we generate random systematics by drawing the value of $\sigma_c(\ell, z)$ in each $\ell$ and $z$ bin from a Gaussian PDF centered about 0. The width of the Gaussian remains fixed as a function of $\ell$ and $z$, and we repeat this process several times with increasingly wide Gaussians (spanning a range that includes current performance and future requirements). We then smooth $\sigma_c$ with a 2D boxcar of width 50 (of 200) $\ell$ bins and 3 (of 10) $z$ bins, and construct $\mathcal{A}(\ell, z_A, z_B) = \sigma_c(\ell, z_A)\,\sigma_c(\ell, z_B)$. The smoothing reflects the typically continuous form of systematic effects; it is important here because (unrealistic) realisations of systematics that are uncorrelated between adjacent bins cause less bias in cosmological parameters. The precise amount of smoothing (particularly in the $\ell$ direction) affects requirements on $\mathcal{A}$ by around 15% of the nominal value. While this precision is adequate for current planning purposes, detailed analysis in the future will require more accurately constrained forms of $\mathcal{A}$ and $\mathcal{M}$ to be propagated.
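The recipe above can be sketched as follows. This is a minimal illustration and not the code used for the paper: the treatment of the boxcar at the grid edges, the Gaussian width of $10^{-6}$ and the random seed are all assumptions, and the function names are hypothetical. The last helper applies equation (67) in each $\ell$ bin to build a multiplicative systematic from a field $m(\ell, z)$.

```python
import random

N_ELL, N_Z = 200, 10      # number of ell and redshift bins quoted in the text
BOX_ELL, BOX_Z = 50, 3    # boxcar widths quoted in the text


def boxcar_2d(field, width_ell, width_z):
    """Average each cell over a width_ell x width_z window.

    Windows are truncated at the grid edges; this edge handling is an
    assumption, since the text does not specify it.
    """
    n_ell, n_z = len(field), len(field[0])
    smoothed = [[0.0] * n_z for _ in range(n_ell)]
    for i in range(n_ell):
        i0 = max(0, i - width_ell // 2)
        i1 = min(n_ell, i - width_ell // 2 + width_ell)
        for j in range(n_z):
            j0 = max(0, j - width_z // 2)
            j1 = min(n_z, j - width_z // 2 + width_z)
            window = [field[a][b] for a in range(i0, i1) for b in range(j0, j1)]
            smoothed[i][j] = sum(window) / len(window)
    return smoothed


def random_sigma_c(width, rng):
    """Draw sigma_c(ell, z) from a zero-mean Gaussian, then smooth it."""
    raw = [[rng.gauss(0.0, width) for _ in range(N_Z)] for _ in range(N_ELL)]
    return boxcar_2d(raw, BOX_ELL, BOX_Z)


def additive_systematic(sigma_c):
    """A(ell, zA, zB) = sigma_c(ell, zA) * sigma_c(ell, zB)."""
    return [[[row[a] * row[b] for b in range(N_Z)] for a in range(N_Z)]
            for row in sigma_c]


def multiplicative_systematic(m):
    """Equation (67) in each ell bin: M = m_A + m_B + m_A * m_B."""
    return [[[row[a] + row[b] + row[a] * row[b] for b in range(N_Z)]
             for a in range(N_Z)] for row in m]


rng = random.Random(42)                  # arbitrary seed
sigma_c = random_sigma_c(1.0e-6, rng)    # hypothetical Gaussian width
A = additive_systematic(sigma_c)
```

Because $\mathcal{A}$ is built as a product of the same field at two redshifts, each realisation is symmetric in $(z_A, z_B)$ and non-negative on the diagonal $z_A = z_B$.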
Figure 3. The bias on measurements of the dark energy equation of state parameter $w$ from weak lensing surveys with (top panel) additive and (bottom panel) multiplicative shear measurement systematics. Each data point shows a random realisation of systematics with a unique dependence upon angular scale and redshift (for clarity, only one in three are plotted). The dotted diagonal lines show the bias on cosmological parameters if the shear measurement systematics are constant. The solid diagonal curves show limiting values that include 95% and 99% of random realisations with a given value of $\mathcal{A}$ or $\mathcal{M}$. An all-sky 3D cosmic shear survey will only be deemed successful if the measurement bias is $\lesssim 31\%$ of the statistical measurement error. At 95% CL, this will require (vertical dashed line) shear measurement better than $\mathcal{A} < 3.5\times10^{-12}$ if $\mathcal{M} = 0$, and $\mathcal{M} < 8.0\times10^{-3}$ if $\mathcal{A} = 0$.



We propagate our random realisations of biases on the cosmic shear power spectrum into biases on $w$ using the Fisher matrix bias formalism as before. The largest biases are generated when the shape of $\mathcal{A}(\ell, z_A, z_B)$ is close to that of $\partial C(\ell, z_A, z_B)/\partial w$. To ensure that the bias on $w$ is less than 31% of the statistical error for 95% of the random realisations, we require

$\mathcal{A} < 1.8\times10^{-12}$ (72)



(see Figure 3), including a factor of 1/2 for a non-zero budget on $\mathcal{M}$. This general requirement is a factor of only ~3 tighter than the requirement if $\mathcal{A}$ is constant (see the upper panel of Figure 2 or the dotted line in Figure 3), demonstrating how bad constant additive systematics can be. Conversely, it is a factor of ~3 looser than if $\mathcal{A}$ is restricted to the family of curves parameterised by equation (66) (see lower panel of Figure 2), which was a pathological case in the worst 1% of random configurations.

For the smallest resolved galaxies $R_{\rm obs} \approx 1.25R_{\rm PSF}$ (i.e. $R_{\rm gal} = 0.75R_{\rm PSF}$), in the regime of the most elliptical PSF typically obtained from astronomical instruments $|\varepsilon_{\rm PSF}| \approx 0.1$, and with an Airy PSF such that $P_R \approx 2$, equations (57) and (72) together become



A: 



0.79a [|(5epsF|] -|-5.2o-^PeNc|: 



4- 0.0023 



JXpgp 



+ 



JXpgp 



.0.0091 1 



2 a^[i?Nc] 

-"-PSF 



-f 0.0023 



p4 
^obs 



< 1.8 X 10" 



(73) 



Note that $\delta\varepsilon_{\rm PSF}$ at least is likely to have two components that each contribute to the total bias.

We then generate 100,000 random multiplicative shear measurement biases $m(\ell, z)$ in the same way and with the same smoothing. We propagate these into multiplicative cosmic shear systematics $\mathcal{M}(\ell, z_A, z_B)$ via equation (67), and hence into biases on $w$. To ensure measurement bias is less than 31% of statistical errors for 95% of the Monte Carlo realisations, we require

$\mathcal{M} < 4.0\times10^{-3}$ (74)



(see Figure 3), including a factor of 1/2 for a finite error budget on $\mathcal{A}$. This is a factor of ~20 tighter than the requirements if $\mathcal{M}$ is constant (see the lower panel of Figure 2 or the dotted line in Figure 3), demonstrating again that a constant multiplicative shear measurement bias has surprisingly little effect on $w$ constraints (note that it does strongly affect constraints on $\Omega_m$ and $\sigma_8$). The amount by which the random systematics are smoothed (particularly in the $z$ direction) affects requirements on $\mathcal{M}$ by around 10% of the nominal value. For the smallest resolved galaxies, equations (59) and (74) become



M ~ 1.8 



psf)) 



(-Rpsp) 



4.0 X 10" 



(75) 



plus redundant variance terms that are already constrained more tightly by equation (73), which we therefore drop here.

A top-down analysis can now allocate error budgets to each of the components of $\mathcal{A}$ and $\mathcal{M}$, as expanded in equations (73) and (75). In the absence of other information, a natural choice would perhaps allocate budgets in inverse proportion to the coefficient by which they affect the overall science. Cropper et al. (2012) provide one such breakdown of these error budgets that is feasible in a dedicated space mission.
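The joint requirement on additive and multiplicative biases can be expressed as a simple linear trade. The sketch below assumes the trade is exactly linear, with the halved requirements $1.8\times10^{-12}$ and $4.0\times10^{-3}$ as its midpoint, so that either requirement doubles when the other bias is exactly zero (consistent, within rounding, with the limits quoted in the caption of Figure 3). Function names are hypothetical.

```python
A_REQ = 1.8e-12   # additive requirement, including the factor of 1/2
M_REQ = 4.0e-3    # multiplicative requirement, including the factor of 1/2


def budget_fraction(a_value, m_value):
    """Fraction of the joint error budget used (1.0 means fully spent).

    Assumes a linear trade between the two performance indicators.
    """
    return a_value / (2.0 * A_REQ) + m_value / (2.0 * M_REQ)


def allowed_additive(m_value):
    """Largest additive indicator still within budget for a given m_value."""
    return max(0.0, 2.0 * A_REQ * (1.0 - m_value / (2.0 * M_REQ)))


print(budget_fraction(A_REQ, M_REQ))   # both at their halved requirement
print(allowed_additive(0.0))           # whole budget given to additive bias
print(allowed_additive(2.0e-3))        # a hypothetical multiplicative bias
```

A method that beats its multiplicative requirement therefore frees part of the budget for additive bias, and vice versa; the same bookkeeping could be repeated one level down for the individual terms of equations (73) and (75).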



4.6 Comparison to other work 

Our calculations differ from those of Amara & Refregier (2008), Chang et al. (2012) and Cardone et al. (in prep.) by using a form-filling approach to consider any possible $\ell$-dependence of systematics, rather than just parametric forms. Amara & Refregier (2008) also assumed only a 2D cosmic shear analysis, with a slightly lower redshift distribution of source galaxies, and considered power spectrum measurements up to scales $\ell < 20{,}000$; we exclude such non-linear scales because poorly-understood effects of baryonic physics are likely to make them difficult to interpret (Semboloni et al. 2011; Kitching, Heavens & Miller 2011). The denominator we introduced in equations (70) and (71) keeps our new $\mathcal{A}$ and $\mathcal{M}$ performance indicators independent (within a few percent) of this choice of $\ell$-range. However, if future understanding of small-scale baryonic effects could indeed extend cosmic shear measurements to $\ell = 20{,}000$, statistical errors $\sigma$ would shrink by ~10%. Exploiting this new information would require correspondingly smaller shear measurement biases.

We can mimic the 2D notation of Amara & Refregier (2008) by multiplying our 3D requirement (72) by its denominator and dividing it by the number of (in our case 55) redshift bin pairs that we considered. This process, plus small corrections for a few other differences ($\ell$-range, $z$-distribution, $|b|/\sigma < 1$, 100% CL) yields a pseudo-2D requirement a factor of ~2 looser than their $\sigma^2_{\rm sys} \lesssim 10^{-7}$ per redshift bin. That difference presumably arises from the details of the redshift slicing, and we shall not consider it further.



5 CAN THE REQUIREMENTS BE MET? 

5.1 Current best shear measurement performance 

The performance of shape measurement algorithms can be tested on simulated astronomical images that contain a known shear signal. Blind competitions include the community-wide STEP (Heymans et al. 2006; Massey et al. 2007b) and GREAT (Bridle et al. 2010; Kitching et al. 2011) programmes; these have been and are continuing to be supplemented by efforts by individual groups targeted towards specific surveys. Assessed using the GREAT metric $Q$, these programmes have yielded a steady improvement by a factor ~3.5 per year over the past decade (Kitching et al. 2012b).

GREAT10 is the most recent blind competition, and the first to employ variable shear simulations, which are required to test scale-dependent issues. The best methods entered into GREAT10 achieve $\mathcal{A} \approx 2.7\times10^{-12}$ and $\mathcal{M} \approx 3.1\times10^{-3}$ on bulge+disc galaxies at detection S/N=40 (Table 4 of Kitching et al. 2012a, in which these values are expressed as $\sqrt{\mathcal{A}}$ and $\mathcal{M}/2$, but see Appendix B for a discussion of slight differences in approach). For these fairly bright galaxies, current performance surpasses the requirement on $\mathcal{M}$ and the requirements on $\mathcal{A}$ and $\mathcal{M}$ can be traded against each other to also be met in combination. Note, however, that this shape measurement inaccuracy uses all but 1% of the entire error budget. GREAT10 assumed a spatially/temporally varying PSF but that it was perfectly known, and that non-convolution effects could be perfectly corrected. Further development in shape measurement will be necessary if part of the error budget is to be set aside for e.g. PSF or CTI modelling errors.

Faint galaxies are harder to measure, but must be included to reach Stage IV surveys' statistical goals on cosmological parameter estimation. At detection S/N=20, the best methods now achieve $\mathcal{A} \approx 2.1\times10^{-11}$ and $\mathcal{M} \approx 5.6\times10^{-3}$; at detection S/N=10 they achieve $\mathcal{A} \approx 7.4\times10^{-11}$ and $\mathcal{M} \approx 1.1\times10^{-2}$. If all galaxies were this faint, exploiting them (consuming all of the available error budget) would exceed requirements in $\sqrt{\mathcal{A}}$ by a factor 3.5-6.5 and in $\mathcal{M}$ by a factor 1.4-2.8. If an analysis were to proceed using extant shear measurement methods, accounting for residual systematic biases would necessarily enlarge the reported error bars: if all galaxies were at detection S/N=10, 95% of realisations of bias would simultaneously satisfy $|b|/\sigma < 3.7$ for $\mathcal{A}$ and $|b|/\sigma < 1$ for $\mathcal{M}$ (see Figure 3).

Shape measurement algorithms can be improved either by fundamental progress or by calibration on accurate simulated images. Extrapolating the current rate of fundamental development (Kitching et al. 2012b) suggests that, with even minimal continued development, the required algorithmic performance will be surpassed, and substantial margin will be achieved, well before the need to analyse Stage IV surveys. Indeed, noise bias (Kacprzak et al. 2012; Melchior & Viola 2012) was unaccounted for by all GREAT10 methods, but appears in faint galaxies at a level consistent with its being the dominant source of bias (Refregier et al. 2012). Proper treatment of noise bias will therefore improve performance for faint galaxies. Several additional improvements have also been suggested (e.g. Bernstein 2010; Viola, Melchior & Bartelmann 2011). For the first time, methods are thus emerging with sufficient accuracy to reliably and fully exploit the statistical potential of Stage IV cosmic shear surveys. Simulations could then be used solely as external verification tests of data analysis pipelines. Dedicated simulation efforts are continuing inside the teams of all weak lensing surveys, and the GREAT3 programme (Mandelbaum, Rowe et al. in prep.) is currently being designed by a worldwide collaboration of the weak lensing community.

5.2 Empirical diagnosis of residual additive 
systematics 

Although the greatest improvement is formally required in additive cosmic shear measurement biases, they are potentially the least troublesome. Many additive systematics can be internally diagnosed within a shear catalogue, and those that do arise can potentially even be calibrated out at the catalogue level. This procedure has a long heritage in Hubble Space Telescope (HST) analyses (e.g. Rhodes et al. 2004; Miralles et al. 2004; Rhodes et al. 2007; Jee et al. 2007; Schrabback et al. 2010; Jee et al. 2011; Hoekstra et al. 2011).



^ The GREAT10 simulations used ground-based PSF morphologies, but STEP3 (see http://www.roe.ac.uk/~heymans/step/step3_results.html) concluded that the only factor affecting shear measurement performance was the ratio of the PSF size to the pixel size. STEP3 was a space-based equivalent of STEP2, run as another public, blind competition. Its results were never published because they were essentially identical to those from STEP2. The main conclusion was that equivalent shear measurement performance could be obtained for small galaxies from space as for similarly-resolved larger galaxies from the ground, irrespective of PSF morphology.

* See www.darkenergysurvey.org, www.lsst.com, www.euclid-ec.org, http://www.naoj.org/Projects/HSC/, Rhodes et al. (2012).
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5.2.1 Calibrating PSF model errors 

The best way to internally diagnose PSF modelling errors
$\delta(R^2_{\rm PSF})$ and $\delta\varepsilon_{\rm PSF}$ is to
bootstrap real stellar shapes. The PSF model can be constructed
from all but a few of the available stars, then interpolated to
the positions and colours of the remaining stars as well as the
galaxies. Any offset between the predicted and measured values
will be a sum of $\delta(R^2_{\rm PSF}) + \delta(R^2_{\rm obs})$
and $\delta\varepsilon_{\rm PSF} + \delta\varepsilon_{\rm obs}$,
but the observational contributions should average to zero over
a large population of stars. The number of degrees of freedom in
PSF variation due to thermomechanical instability (Jarvis & Jain
2005; Rhodes et al. 2007; Schrabback et al. 2010), atmospheric
turbulence (Jarvis, Schechter & Jain 2009) or changing gravity
load (Iye et al. 2004) can also be usefully compared to
engineering predictions from raytracing through optics models
(Krist 1995; Hook & Stoehr 2008).
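
The hold-out test described above can be sketched numerically.
This is a minimal illustration, not the procedure of any cited
analysis: it assumes that one component of the PSF ellipticity
varies smoothly across the field and can be described by a
low-order 2D polynomial. The function name
`holdout_psf_residuals`, the polynomial order and the reserved
fraction are all hypothetical choices.

```python
import numpy as np

def holdout_psf_residuals(x, y, e_star, holdout_frac=0.1, order=2, seed=0):
    """Fit a 2D polynomial PSF-ellipticity model to most of the stars and
    return (measured - predicted) for the stars that were held back."""
    rng = np.random.default_rng(seed)
    n_star = len(x)
    n_test = max(1, int(round(holdout_frac * n_star)))
    test = np.zeros(n_star, dtype=bool)
    test[rng.choice(n_star, size=n_test, replace=False)] = True

    def design(xs, ys):
        # One column per monomial x^i y^j with i + j <= order
        return np.column_stack([xs**i * ys**j
                                for i in range(order + 1)
                                for j in range(order + 1 - i)])

    # Build the model from the training stars only
    coeffs, *_ = np.linalg.lstsq(design(x[~test], y[~test]),
                                 e_star[~test], rcond=None)
    # Interpolate the model to the reserved stars and compare
    predicted = design(x[test], y[test]) @ coeffs
    return e_star[test] - predicted
```

If the model is adequate, the residuals on the reserved stars
should scatter about zero at the level of the measurement noise.
A significant mean offset, or a residual scatter well above the
noise, would point to a nonzero $\delta\varepsilon_{\rm PSF}$.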

A vital test of successful PSF deconvolution is obtained from
the correlation of measured shears with the PSF ellipticity. No
residual signature of the system's PSF should find its way into
the galaxy shape catalogue, so these should be uncorrelated.
However, in a flawed shear measurement, taking unweighted
$\hat\varepsilon_{\rm PSF} = \varepsilon_{\rm PSF} +
\delta\varepsilon_{\rm PSF}$ and $\hat\gamma$ from equations
(54) and (55), we obtain



[Equation (76): the expectation value of the product of the
measured shear and the PSF ellipticity, written in terms of the
PSF, galaxy and observed sizes and their errors. The expression
is garbled in this copy.]    (76)



plus many more terms of order $O(\delta^2)$, including some
proportional to equations (83) and (84). While it would be
difficult to identify and then calibrate out any individual
contribution from this mixed observable, it can be used as an
invaluable post facto check that other techniques have
successfully removed almost all of the additive cosmic shear
systematics.
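
As an illustration of this null test, the sketch below estimates
the shear-PSF correlation and its standard error from a
catalogue. It is a toy under stated assumptions: ellipticities
are stored as complex numbers $e_1 + i e_2$, the estimator is a
simple unweighted mean, and the catalogue, the 30 per cent
leakage and the name `shear_psf_correlation` are hypothetical.

```python
import numpy as np

def shear_psf_correlation(gamma_hat, e_psf):
    """Mean of Re(gamma_hat * conj(e_psf)) and its standard error.

    Both inputs are complex arrays (e1 + 1j*e2), one entry per galaxy;
    e_psf is the PSF model ellipticity at each galaxy position.
    """
    prod = np.real(np.asarray(gamma_hat) * np.conj(np.asarray(e_psf)))
    return prod.mean(), prod.std(ddof=1) / np.sqrt(prod.size)

# Toy catalogue (hypothetical numbers): random intrinsic shapes and a
# PSF ellipticity pattern that is independent of them.
rng = np.random.default_rng(42)
n_gal = 20000
gamma_clean = rng.normal(0, 0.26, n_gal) + 1j * rng.normal(0, 0.26, n_gal)
e_psf = rng.normal(0, 0.05, n_gal) + 1j * rng.normal(0, 0.05, n_gal)

# A flawed measurement that leaves 30% of the PSF ellipticity behind.
gamma_flawed = gamma_clean + 0.3 * e_psf

for label, g in [("clean", gamma_clean), ("flawed", gamma_flawed)]:
    xi, err = shear_psf_correlation(g, e_psf)
    print(f"{label}: <gamma e_PSF> = {xi:+.2e} +/- {err:.1e}")
```

A result consistent with zero does not identify which systematic
has been removed, but a significant detection shows that some
PSF signature remains in the catalogue.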



5.2.2 Calibrating residual detector effects 

Non-convolution detector effects can accumulate in space-based
instruments over time, as radiation damages the hardware. Thus
any long-term, monotonic drift in the mean
$\langle R^2_{\rm obs}\rangle$ or
$\langle\varepsilon_{\rm obs}\rangle$ within each exposure - or,
even better, within a calibration field that can be returned
to - indicates a nonzero $\delta(R^2_{\rm NC})$ or
$\delta\varepsilon_{\rm NC}$.

Many detector effects also exhibit a characteristic dependence
upon chip position. This is most notable for Charge Transfer
Inefficiency in CCDs, where the image degradation increases
linearly with distance $y$ from the readout register, up to
$y = y_{\rm max}$, the size of the CCD (Massey et al. 2010). In
this case, correlating shear measurements with chip position, or
fitting shear measurements as a function of chip position,
measures nonzero

[Equation (77): the measured dependence of shear upon chip
position, written in terms of
$\delta\varepsilon_{\rm NC}|_{y_{\rm max}}$,
$\delta(R^2_{\rm NC})|_{y_{\rm max}}$ and the residual PSF
modelling errors. The expression is garbled in this copy.]    (77)



where we assume $\delta\varepsilon_{\rm NC}|_{y_{\rm max}}$ and
$\delta(R^2_{\rm NC})|_{y_{\rm max}}$ are constant over a
sufficiently long time period to gather statistically
significant measurements. If
$\langle\varepsilon_{\rm PSF}\rangle = 0$ and all other (PSF,
observational) errors were zero, this would be a direct test of
$\delta\varepsilon_{\rm NC}$. However, the reality that (77)
contains terms mixed with residual PSF modelling errors has made
analysis of HST data challenging. Only by first verifying the
PSF model with tests from Section 5.2.1 were Rhodes et al.
(2007), Schrabback et al. (2010) and Hoekstra et al. (2011) able
to subtract this measurement of $\delta\varepsilon_{\rm NC}$
from a shear catalogue, following equation (34). However, such
an empirical, catalogue-level correction should be seen as a
last resort because it addresses neither $\delta(R^2_{\rm NC})$
nor the mixing between sources of error, whereby an imperfect
removal of additive systematics can introduce an (undiagnosable)
multiplicative cosmic shear systematic. A much more robust
technique, demonstrated by Leauthaud et al. (2010), is to apply
a physically-motivated correction scheme at the pixel level as
the first process during data reduction (e.g. Massey 2010;
Anderson & Bedin 2010). The performance of this technique can
again be tested via equation (77), and improved by iteration.
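
The fit of shear against chip position can be sketched as an
ordinary least-squares line. This is a toy example only: it
assumes the spurious ellipticity is purely linear in $y$ and
affects a single ellipticity component, and the function name
`fit_shear_vs_chip_position` and all numbers are hypothetical.

```python
import numpy as np

def fit_shear_vs_chip_position(y, e1, y_max):
    """Least-squares fit of e1 = a + b * (y / y_max).

    Returns (a, b, sigma_b), where b is the spurious ellipticity
    accumulated at the far edge of the chip and sigma_b its 1-sigma error.
    """
    t = np.asarray(y, dtype=float) / y_max
    design = np.column_stack([np.ones_like(t), t])
    coeffs, *_ = np.linalg.lstsq(design, e1, rcond=None)
    resid = e1 - design @ coeffs
    # Noise variance estimated from the scatter about the fitted line
    sigma2 = resid @ resid / (len(e1) - 2)
    cov = sigma2 * np.linalg.inv(design.T @ design)
    return coeffs[0], coeffs[1], np.sqrt(cov[1, 1])

# Toy data: shape noise plus a trend reaching 0.02 at y = y_max
rng = np.random.default_rng(7)
y_pix = rng.uniform(0, 2048, 100000)
e1_obs = 0.02 * y_pix / 2048 + rng.normal(0, 0.26, y_pix.size)
a, b, sigma_b = fit_shear_vs_chip_position(y_pix, e1_obs, 2048)
print(f"slope at y_max: {b:.4f} +/- {sigma_b:.4f}")
```

Because shape noise dominates each galaxy, the slope is only
detected after combining many galaxies, which is why the text
assumes the detector effect is stable over a long period.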



5.3 Impact of residual multiplicative systematics 

Multiplicative cosmic shear measurement biases are poten- 
tially the most troublesome, because there is no known 
cosmology-independent way to accurately diagnose residual 
multiplicative bias internally within a dataset (except that
it may leak weakly into a small unphysical B-mode signal
(Vale 2006), but so do many things). Analyses must either
rely upon theoretical calculations of the shear calibration, 
or test a measurement pipeline on simulated images that 
contain a known signal and rely upon the veracity of those 
simulations. Since multiplicative systematic errors are thus 
more problematic than additive errors, and because the re- 
quirements on them are similarly hard to meet, we shall 
investigate them more carefully. 

Rather than considering galaxies all of the same size 
and detection S/N, we shall now consider a realistic, full 
population of galaxies. Some galaxies are bigger and brighter 
than others, and it will be easier to measure their shapes. 
The form of equation (55) suggests that multiplicative shape
measurement biases predominantly depend upon the relative
size of the PSF and the surveyed galaxies

    $m \approx m_0 + m_1 \left( R_{\rm PSF} / R_{\rm gal} \right)^2$ .    (78)



This characteristically quadratic performance was indeed
apparent in many of the methods tested in STEP2 (Massey et al.
2007b, top-right panels of figure 7). Similar behaviour is
suggested in GREAT08 (Bridle et al. 2010, figure C3) and is
explicitly fitted in GREAT10 (Kitching et al. 2012a, appendix
B5) as

    $m \approx m_0 + \alpha\,[\ldots]$    (79)

[the size-dependent factor multiplying $\alpha$ is garbled in
this copy]

where $\langle R^2_{\rm PSF}\rangle = 3.4^2$ pixels$^2$,
$\langle R^2_{\rm gal}\rangle = 3.55^2$ pixels$^2$ (averaging
the contribution of the bulges and discs), and the best methods
achieve $\alpha \sim 0.005$ (Kitching et al. 2012a, figure 5).
Note that GREAT10's fiducial PSF had a Moffat profile, for which
$P_R \sim 1$. Diffraction-limited surveys with $P_R \sim 2$ will
likely achieve better performance although, since that was only
tested in a subset of the GREAT10 data whose results were
dominated by method bias, we shall conservatively assume only
the performance explicitly demonstrated.

We showed in Section 4.3 that constraints on the nature of dark
energy are largely insensitive to a constant multiplicative bias
$m_0$. The achieved value of $m_1$ is thus likely to be the
driving requirement for success. We shall baseline a currently
achievable performance of $m_0 \sim 0$ and $m_1 \sim 0.006$. We
shall then fold through the observed distribution of galaxy
sizes to consider the prospects of two generic regimes proposed
for future surveys: a space-based mission with a PSF Full Width
at Half Maximum (FWHM) of 0.2" and a ground-based telescope with
a FWHM seeing of 0.7".



5.3.1 Two dimensional cosmic shear 

To quantify the typical size of galaxies in the Universe as a
function of magnitude, we measure the sizes of galaxies in
$i_{775W}$-band observations of the HST Ultra Deep Field (UDF;
Beckwith et al. 2003) (Figure 4a). To compute the approximate
intrinsic size of the galaxies, we assume that their profiles
are Gaussian (with a FWHM equal to their measured FWHM), and
that the ACS PSF has a FWHM of 0.1". Fainter galaxies are
smaller (Figure 4b) but, down to $i_{775W} \sim 26$, most are
intrinsically larger than the ACS PSF.
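
Under the Gaussian assumption stated above, sizes add in
quadrature under convolution, so the PSF can be removed (or a
different one applied) with a single subtraction or addition of
squares. The sketch below illustrates this; the function names
and the example sizes are hypothetical, and clipping unresolved
objects to zero size is our simplification.

```python
import numpy as np

def intrinsic_fwhm(fwhm_measured, fwhm_psf=0.1):
    """Remove a Gaussian PSF in quadrature (all sizes in arcsec).

    Objects measured to be smaller than the PSF are assigned zero size.
    """
    excess = np.square(fwhm_measured) - fwhm_psf**2
    return np.sqrt(np.clip(excess, 0.0, None))

def observed_fwhm(fwhm_gal, fwhm_psf):
    """Size the same Gaussian galaxy would have behind another PSF."""
    return np.hypot(fwhm_gal, fwhm_psf)

measured = np.array([0.12, 0.30, 0.80])   # hypothetical sizes seen by ACS
gal = intrinsic_fwhm(measured)
for name, psf in [("space, FWHM 0.2", 0.2), ("ground, FWHM 0.7", 0.7)]:
    ratio = observed_fwhm(gal, psf) / psf
    print(name, "observed/PSF size ratio:", np.round(ratio, 2))
```

For Gaussians the ratio of FWHMs equals the ratio of sizes
$R_{\rm obs}/R_{\rm PSF}$, so the printed ratios can be compared
directly with the resolution cuts used in Figure 4.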

Many more galaxies are resolved
($R_{\rm obs} > 1.25 R_{\rm PSF}$) by the hypothetical
space-based mission than the hypothetical ground-based survey
(Figure 4c). Crucially, most galaxies in space-based
observations are not only resolved but very well resolved.
Following (78), this naturally leads to a better


^ In STEP2, methods that applied an overall 'calibration factor'
from analysis of independent simulated images (e.g. TS and
several not plotted) appear to have achieved
$\langle m \rangle \approx 0$ by adjusting $m_0$ such that
$m(\langle R_{\rm gal} \rangle) = 0$ for galaxies of average
size.

A survey's effective $R_{\rm PSF}$ may be a complicated function
of the system PSF at different times. Some state-of-the-art
shear measurement algorithms downweight the contribution from
exposures with poor seeing. This improves the effective
$R_{\rm PSF}$, at a cost of decreased imaging depth.

Figure 4. Prospects for 2D weak gravitational lensing surveys.
Panel (a): The observed size $R_{\rm obs}$ and i-band magnitude
of objects in the UDF. The vertical dashed line indicates the
size of the ACS PSF. Panel (b): Galaxies' average intrinsic size
$R_{\rm gal}$ as a function of magnitude, under the assumption
that the galaxies and PSF have Gaussian profiles. The error bars
indicate the dispersion in $R_{\rm gal}$. Panel (c): The
cumulative number density of resolved galaxies as a function of
(limiting) magnitude, with sizes
$R_{\rm obs} > 1.25 R_{\rm PSF}$ (thick lines) or
$R_{\rm obs} > 1.1 R_{\rm PSF}$ (thin lines). The dashed lines
correspond to a space-based mission with a FWHM = 0.2" for the
PSF. Note that Euclid's wide-band observations to magnitude 24.5
correspond roughly to $i_{775W} \sim 25.2$ (vertical dotted
line). The solid curves are for a typical ground-based PSF with
FWHM = 0.7". Panel (d): Predicted shear measurement bias for the
best current methods, averaged over the population of resolved
galaxies. Requirement (74) is shown as a horizontal dotted line,
assuming $\mathcal{M} \approx 2m$.



shear measurement bias (Figure 4d). For a full, realistic
population of source galaxies in a 2D cosmic shear survey from
space, current shear measurement performance satisfies
requirement (74), in the absence of PSF variation or detector
effects. Any subsequent improvement will provide increased
margin for imperfect PSF and detector models.
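
The population average behind Figure 4d can be sketched as
follows. This is an illustration under explicit assumptions, not
the calculation used for the figure: we take the quadratic form
$m = m_0 + m_1 (R_{\rm PSF}/R_{\rm gal})^2$ suggested by the
text, the baseline $m_0 = 0$ and $m_1 = 0.006$, Gaussian profiles
(so FWHM ratios stand in for size ratios), a step-function cut at
$R_{\rm obs} > 1.25 R_{\rm PSF}$, and a made-up list of three
galaxy sizes. The function name is hypothetical.

```python
import numpy as np

def mean_multiplicative_bias(fwhm_gal, fwhm_psf, m0=0.0, m1=0.006, cut=1.25):
    """Average m over the galaxies that pass the resolution cut.

    Returns (mean bias, number of resolved galaxies); the mean is NaN
    if no galaxy is resolved.
    """
    fwhm_gal = np.asarray(fwhm_gal, dtype=float)
    # Observed size after convolution with a Gaussian PSF
    fwhm_obs = np.hypot(fwhm_gal, fwhm_psf)
    resolved = fwhm_obs > cut * fwhm_psf
    if not resolved.any():
        return np.nan, 0
    m = m0 + m1 * (fwhm_psf / fwhm_gal[resolved]) ** 2
    return m.mean(), int(resolved.sum())

sizes = [0.2, 0.4, 1.0]   # hypothetical intrinsic FWHMs in arcsec
for name, psf in [("space", 0.2), ("ground", 0.7)]:
    m_mean, n_res = mean_multiplicative_bias(sizes, psf)
    print(f"{name}: <m> = {m_mean:.5f} from {n_res} resolved galaxies")
```

Even in this toy case the ground-based PSF resolves fewer
galaxies, and those that survive the cut are biased at a similar
or higher level than the full space-based sample.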

Ground-based surveys face two problems. First, a greater
improvement in shape measurement techniques is required for them
to reach their full potential than space-based surveys (Figure
4d). This is simply because of the difficulty resolving galaxies
from the ground, without even taking into

Figure 5. Prospects for 3D weak gravitational lensing surveys.
Panel (a): The density of galaxies as a function of photometric
redshift $z_{\rm phot}$ for galaxies with
$20 < i_{775W} < 26.5$ (results do not depend strongly on the
choice of limiting magnitude). Panel (b): Average galaxy size
$\langle R_{\rm gal} \rangle$ as a function of redshift, under
the assumption that the galaxies have Gaussian profiles. The
error bars indicate the dispersion in $R_{\rm gal}$. Panel (c):
Multiplicative shear calibration bias $m$ as a function of
redshift for galaxies with sizes
$R_{\rm obs} > 1.25 R_{\rm PSF}$ (thick lines) or
$R_{\rm obs} > 1.1 R_{\rm PSF}$ (thin lines). The dashed line
corresponds to a space-based mission with a PSF of FWHM = 0.2",
and the solid curve is for typical ground-based seeing with
FWHM = 0.7". Requirement (74) is shown as a dotted line,
assuming $\mathcal{M} \approx 2m$.



account the much harder task of modelling the PSF due to a
turbulent atmosphere and more variable physical conditions.
Second, even in extremely deep images covering the entire sky,
not enough galaxies are resolved
($R_{\rm obs} > 1.25 R_{\rm PSF}$) to obtain statistical
measurement errors on $w$ competitive with other techniques
(Figure 4c). More galaxies could be included in an analysis by
lowering the size cut, for example to
$R_{\rm obs} > 1.1 R_{\rm PSF}$. Increasing the density of
galaxies reduces statistical measurement error, but at a cost of
even more rapidly increasing systematic bias, such that current
methods do not meet requirements.



5.3.2 Three dimensional cosmic shear 

Three dimensional cosmic shear analysis requires measure- 
ments of both shear and redshift for each galaxy, and for 



It is far more effective to add small galaxies than faint ones,
especially for a ground-based survey, because faint galaxies are
also so much smaller. In practice, our crude step-function cut
is also usually replaced by a smoothly varying weight function
(Hoekstra, Franx & Kuijken 2000); the result of this will lie
between the two extremes we have considered.



the shears to be measured without (even relative) bias as a
function of redshift (Kitching, Heavens & Miller 2011). To
estimate this bias in a real population of galaxies, we use
photometric redshift estimates for $20 < i_{775W} < 26.5$
galaxies in the HST UDF by Coe et al. (2006). The distribution
of best-fit redshifts peaks around z ~ 0.5 but also samples a
long tail out to z ~ 3 (Figure 5a). Beyond redshift z ~ 3, the
scarcity of UDF galaxies makes our statistics unstable.

The mean and rms apparent size of galaxies decrease noticeably
above z ~ 1.5-2 (Figure 5b). Multiplicative shear measurement
bias will therefore get slightly worse at high redshift (Figure
5c). For a space-based survey, meeting requirements in every
redshift bin will demand algorithms with multiplicative biases a
factor 1.8-2.2 better than current methods (which could come
from calibration on very accurate simulated images). Note that
this analysis is completely independent of that in Section
5.3.1. That their conclusions are so consistent lends support to
both methodologies.

Ground-based observations are more profoundly affected by the
decrease in galaxy size at $z \gtrsim 1.5$. Very deep images
will help, because some fraction of systematics is doubtless due
to noise bias (Refregier et al. 2012). However, a dramatic
improvement in shear measurement methods will be required for
ground-based surveys to span the high redshifts needed to probe
the growth of structure. As before, this argument is based
purely on the small size of galaxies compared to a ground-based
PSF, and does not take into account the additional challenge of
modelling the more variable ground-based PSF.



6 DISCUSSION AND CONCLUSIONS 

We have derived expressions showing how various sources of error
in galaxy shape measurement propagate into additive biases
$\mathcal{A}$ (eqn. 57) and multiplicative biases $\mathcal{M}$
(eqn. 59) on cosmic shear results. Additive biases include a
contribution from mis-estimation of a telescope's PSF shape, and
multiplicative biases include mis-estimation of the PSF size.
This agrees with the behaviour generically seen in empirical
tests of shear measurement methods. For the first time, we have
also propagated into cosmic shear results the consequences of
imperfect correction for non-linear detector effects, and
imperfect image processing algorithms.

We have ascertained the maximum level of additive biases
$\mathcal{A}(\ell, z)$ (eqn. 72) and multiplicative biases
$\mathcal{M}(\ell, z)$ (eqn. 74) that can be tolerated by a
next-generation cosmic shear survey attempting to constrain the
dark energy equation of state parameter $w$ to within 0.065
(68% CL). Cosmic shear measurements of $w$ are surprisingly
insensitive to a constant multiplicative bias. To explore more
generic scale- and redshift-dependent systematic biases, we have
used a form-filling technique; based upon the 95% confidence
limit averaged equally over all possible functional forms, we
define convenient requirements on mean $\mathcal{A}$ and
$\mathcal{M}$. Cropper et al. (2012) distribute this overall
requirement into budgets on the individual sources of error (PSF
knowledge, detector knowledge, accuracy of shape measurement
algorithms) in an allocation that is suitable for a real space
mission.

We compare our requirements on galaxy shape mea- 
surement software to the performance seen recently in the 
public, blind GREAT10 challenge. Extant shear measurement
methods meet both requirements for Stage IV weak
lensing surveys, for bright galaxies at detection S/N=40 or
for a 2D cosmic shear survey from space in which the con- 
tributions from a large population of galaxies are combined. 
This will generally not provide sufficient galaxies to meet 
Stage IV surveys' goals for the statistical errors on cosmo- 
logical parameters. This also assumes that the telescope and 
instrument hardware can be well modelled; a modest im- 
provement will create margin for imperfect modelling and 
correction of the system PSF or detector effects. 

Fully exploiting the statistical potential of Stage IV weak
lensing surveys will require shear measurement software that
works more accurately than current algorithms on faint galaxies.
Current algorithms could introduce systematic biases of the same
order of magnitude as the statistical errors, and the total
reported confidence limits would need to be enlarged by a factor
$\sim\sqrt{2}$ to account for this effect. To be sure of
avoiding this problem, if all galaxies were detected at
S/N=20-10 and all of them were used, additive biases must be
reduced by a factor 3.6-6.5. However, many tests can be used to
identify and remove portions of a shear catalogue with additive
biases; we have used our new formalism to show exactly what each
test is sensitive to. Using an entire, realistic population of
faint galaxies would also need a reduction in multiplicative
bias by a factor 1.4-2.8. Averaging over a realistic galaxy
population extending to $z \gtrsim 1.5$, a space-based 3D cosmic
shear analysis will need an improvement in multiplicative bias
by a factor 1.8-2.2. No internal tests can identify
multiplicative biases, so the greatest development effort should
be spent to minimise these.
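
The enlargement of the confidence limits quoted above can be
understood with one line of arithmetic. As a simplifying
assumption (this reading is not spelled out in the text), suppose
that an uncorrected systematic bias equal in size to the
statistical error is treated as an independent contribution and
added in quadrature:

```latex
\sigma_{\rm tot} = \sqrt{\sigma_{\rm stat}^2 + \sigma_{\rm sys}^2}
                 = \sqrt{2}\,\sigma_{\rm stat}
\quad\text{when}\quad \sigma_{\rm sys} = \sigma_{\rm stat}.
```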

Several new ideas for image analysis techniques are be- 
ing discussed in the literature, and ongoing simulation pro- 
grammes show potential. The past decade has seen steady 
improvement in shape measurement algorithms; extrapolat- 
ing even minimal continued development suggests that the 
required algorithmic performance will be met well before the 
need to analyse Stage IV surveys. Importantly, it will be at 
least 3-5 times easier to meet requirements for high resolu- 
tion space-based rather than ground-based surveys, because 
multiplicative biases depend (theoretically and empirically) 
on the inverse square of the S/N and the square of the PSF 
size. This conclusion that ground-based surveys will require 
much better shear measurement methods than space-based 
surveys arises solely because they do not resolve galaxies 
well, and does not even take into account the additional chal- 
lenge of modelling atmospheric turbulence or more rapidly 
changing physical conditions.

APPENDIX A: REMAINING CROSS TERMS

In Section 3 we ignored several cross terms in earlier
calculations of the additive cosmic shear systematic
$\mathcal{A}$ because we expect their contributions to be
subdominant as long as the PSF model, detector characterisation
and shape measurement method are working properly. However,
tests for the presence of these terms in real data could be a
useful, cosmology-independent way to verify that the pipeline is
meeting requirements. We shall now discuss four noteworthy order
$O(\delta)$ terms that potentially add to $\mathcal{A}$. These
are



[In equations (80)-(83), $[\ldots]$ marks size-dependent
coefficients that are garbled in this copy.]

    $\langle\varepsilon_{\rm gal}\cdot\varepsilon_{\rm PSF}\rangle\,[\ldots]$    (80)

in the presence of the selection bias discussed by Hirata &
Seljak (2003), whereby galaxies are more likely to be detected
if their intrinsic shapes are similar to that of the PSF;

    $\langle\varepsilon_{\rm gal}\cdot\delta\varepsilon_{\rm obs}\rangle\,[\ldots]$    (81)

if, for example, a faulty shape measurement method
systematically truncates the isophotes of elliptical galaxies;
and

    $+\,\langle\varepsilon_{\rm gal}\cdot\delta\varepsilon_{\rm NC}\rangle\,[\ldots]$    (82)

with Charge Transfer Inefficiency, for which
$\delta\varepsilon_{\rm NC}$ depends on $\varepsilon_{\rm gal}$
(Rhodes et al. 2010); and

    $\langle\varepsilon_{\rm gal}\cdot\delta\varepsilon_{\rm PSF}\rangle\,[\ldots]$    (83)

if some small galaxies (which have been sheared, so correlate
with their neighbours) are accidentally confused with stars and
allowed to contribute towards the PSF model. Of all these, the
first two terms of (80) are likely to be the most problematic:
the first because stars and galaxies have different colours, so
a PSF model naively obtained from stars will be systematically
too large, and the second because model inaccuracies in
nonlinear correction will likely dominate variations in the
effect across the detector.

There are also several terms of order $O(\delta^2)$. Two that
may feasibly have nonzero coefficients are

    [Equation (84) is garbled in this copy.]    (84)

if the PSF modelling errors depend upon the ellipticity of a
complex PSF whose shape changes as a function of radius; and

    $+\,\langle\delta\varepsilon_{\rm PSF}\cdot\delta\varepsilon_{\rm NC}\rangle\,[\ldots]$    (85)

[the size-dependent coefficient in (85) is garbled in this copy]

if residuals from the correction of nonlinear detector effects
also contaminate the bright stars from which the PSF is
modelled.

Finally, we also ignored cross terms like
$\langle m \gamma c \rangle$ in the correlation functions.
Ideally, $\langle m \gamma c \rangle = \langle m c \rangle
\langle \gamma \rangle$ and $\langle \gamma \rangle = 0$, but
this latter equality does not hold in the presence of Hirata &
Seljak (2003) selection biases. Furthermore, we have shown that
$m$ and $c$ are both correlated with $\delta(R^2_{\rm PSF})$ and
therefore with each other, so the prefactor may be considerable.
This sort of combination could give rise to a whole new slew of
potential intrinsic-intrinsic, intrinsic-c, intrinsic-m, etc.
systematics. We shall explore these in future work.



APPENDIX B: PERFORMANCE INDICATORS
USED IN GREAT10



In equations (70) and (71), we introduced performance indicators
$\mathcal{A}$ and $\mathcal{M}$, based upon integrals over a
range of scales. For consistency with earlier work (Amara &
Refregier 2008), we chose to weight the scales by
$\ell^2\,{\rm d}\ln\ell$, but different choices could have been
made. Integration with respect to ${\rm d}\ell$ typically raises
the numerical value of $\mathcal{A}$ by ~10% and $\mathcal{M}$
by ~3%. A similar loosening would also need to be applied to the
numerical value of the requirements, and this is a negligible
change. Including weighting by $C(\ell)\,{\rm d}\ell$ inside the
integrals in (71) rescales the performance indicator and
requirement so that they have a numerical value similar to
$\mathcal{A}$. However, this would mean losing intuition from
previous studies and make the requirements formally
cosmology-dependent. Furthermore, since the shape of the
$C(\ell)$ weight approximately recovers that of
$\ell^2\,{\rm d}\ln\ell$, changes to numerical values are even
smaller than the previous option.

Practical considerations forced the measurements in GREAT10
(Kitching et al. 2012a) to use a different range in $\ell$ and a
different weight function. It is important to consider the
effect of this, because we use the GREAT10 results as an
indication of current best performance. The GREAT10 analysis
measured $\hat{C}(\ell)$ at linearly separated values
$\ell$ = {233, 415, 600, 789, 977, 1162, 1350, 1538}, then found
the least-squares fitting function
$(1 + \mathcal{M})C(\ell) + \mathcal{A}$ with constant
$\mathcal{A} = \mathcal{A}_G$ and $\mathcal{M} = \mathcal{M}_G$.
This process thus minimises

    $\chi^2 = \sum_\ell \left[ \hat{C}(\ell) - (1+\mathcal{M})C(\ell) - \mathcal{A} \right]^2$ .    (86)

Therefore

    $\frac{\partial\chi^2}{\partial\mathcal{A}} = -2 \sum_\ell \left[ \hat{C}(\ell) - (1+\mathcal{M})C(\ell) - \mathcal{A} \right] = 0$    (87)

so, if $\mathcal{M} = 0$,

    $\mathcal{A}_G = \frac{\sum_\ell \left( \hat{C}(\ell) - C(\ell) \right)}{\sum_\ell 1}$ .    (88)

Approximating the discrete sums with constant $\Delta\ell$ as
continuous integrals, and remembering that
$\mathcal{A} = \mathcal{A}_G$ is constant so can be extracted
from the integrals,

    $\mathcal{A}_G = \frac{\frac{1}{2\pi}\int \left( \hat{C}(\ell) - C(\ell) \right) {\rm d}\ell}{\frac{1}{2\pi}\int {\rm d}\ell}$ .    (89)

This is similar to equation (70), although a version in which
the various $\ell$ scales are weighted differently. The
different weighting changes our conclusions by less than 10%, so
we ignore this small perturbation.

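Least-squares fitting also guarantees that

    $\frac{\partial\chi^2}{\partial\mathcal{M}} = -2 \sum_\ell C(\ell) \left[ \hat{C}(\ell) - (1+\mathcal{M})C(\ell) - \mathcal{A} \right] = 0$    (90)

so, if $\mathcal{A} = 0$,

    $\mathcal{M}_G = \frac{\sum_\ell C(\ell) \left( \hat{C}(\ell) - C(\ell) \right)}{\sum_\ell \left( C(\ell) \right)^2}$    (91)

    $\approx \frac{\frac{1}{2\pi}\int C(\ell) \left( \hat{C}(\ell) - C(\ell) \right) {\rm d}\ell}{\frac{1}{2\pi}\int \left( C(\ell) \right)^2 {\rm d}\ell}$ .    (92)

This again is merely a differently-weighted version of equation
(71), with negligible effect upon our conclusions.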
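
The two closed-form results of this appendix, the mean residual
for the additive term when $\mathcal{M} = 0$ and the
$C(\ell)$-weighted residual for the multiplicative term when
$\mathcal{A} = 0$, can be checked numerically. The sketch below
does so at the eight $\ell$ values quoted above. The power-law
spectrum, the injected biases and the function names are
hypothetical, chosen only to exercise the formulae.

```python
import numpy as np

ELL = np.array([233, 415, 600, 789, 977, 1162, 1350, 1538], dtype=float)

def fit_additive(c_hat, c_true):
    """Best-fit constant A when M is fixed to 0: the mean residual."""
    return np.mean(c_hat - c_true)

def fit_multiplicative(c_hat, c_true):
    """Best-fit constant M when A is fixed to 0."""
    return np.sum(c_true * (c_hat - c_true)) / np.sum(c_true**2)

# Hypothetical power-law spectrum, for illustration only.
c_true = 1e-9 * (ELL / 1000.0) ** -1.0

# Inject a 1% multiplicative bias and compare the closed form with a
# direct least-squares solve for the slope (1 + M).
c_hat_m = 1.01 * c_true
slope, *_ = np.linalg.lstsq(c_true[:, None], c_hat_m, rcond=None)
print("M (closed form, lstsq):",
      fit_multiplicative(c_hat_m, c_true), slope[0] - 1.0)

# Inject a constant additive bias and recover it.
c_hat_a = c_true + 2e-12
print("A (closed form):", fit_additive(c_hat_a, c_true))
```

Fitting both parameters at once would couple them through the
normal equations; the separate fits shown here correspond to the
two limiting cases treated in the text.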